Real Estate Property Valuation using Self-Supervised Vision Transformers

Mahdieh Yazdani, Maziar Raissi
DOI: https://doi.org/10.48550/arXiv.2302.00117
2023-02-01
Abstract: The use of Artificial Intelligence (AI) in the real estate market has been growing in recent years. In this paper, we propose a new method for property valuation that utilizes self-supervised vision transformers, a recent breakthrough in computer vision and deep learning. Our proposed algorithm uses a combination of machine learning, computer vision and hedonic pricing models trained on real estate data to estimate the value of a given property. We collected and pre-processed a data set of real estate properties in the city of Boulder, Colorado and used it to train, validate and test our algorithm. Our data set consisted of qualitative images (including house interiors, exteriors, and street views) as well as quantitative features such as the number of bedrooms, bathrooms, square footage, lot square footage, property age, crime rates, and proximity to amenities. We evaluated the performance of our model using metrics such as Root Mean Squared Error (RMSE). Our findings indicate that these techniques are able to accurately predict the value of properties, with a low RMSE. The proposed algorithm outperforms traditional appraisal methods that do not leverage property images and has the potential to be used in real-world applications.
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning, Econometrics
What problem does this paper attempt to address?
This paper asks how self-supervised vision transformers can improve the accuracy of real estate valuation. Specifically, the authors propose a method that combines machine learning, computer vision, and hedonic pricing models, and estimates property values from a data set containing interior, exterior, and street-view images of properties together with quantitative features. The method aims to overcome a limitation of traditional valuation methods, which rely only on structural factors and socio-economic conditions and ignore the influence of a house's visual appearance on buyers' decisions. With self-supervised learning, the model can learn useful features from a large amount of unlabeled image data, which is intended to yield more accurate price predictions, especially for properties whose visual appearance has a significant impact on value.
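
As a rough illustration of the idea, and not the authors' implementation, the sketch below concatenates image embeddings with tabular features, fits a simple hedonic-style linear model on log price, and compares RMSE with and without the image features. Everything in it is an assumption made for illustration: the embeddings are random placeholders standing in for the output of a self-supervised vision transformer, the data are synthetic and constructed so that the image features carry signal, and ridge regression stands in for whatever regressor the paper actually uses. The numbers it produces say nothing about the paper's results.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root Mean Squared Error between two 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b


def predict(X, w, b):
    return X @ w + b


# Synthetic stand-ins (assumptions, not the Boulder data set from the paper).
rng = np.random.default_rng(0)
n, d_img, d_tab = 400, 16, 5
img_emb = rng.normal(size=(n, d_img))  # placeholder for ViT image embeddings
tabular = rng.normal(size=(n, d_tab))  # placeholder for standardized bedrooms, area, age, ...

# By construction, the synthetic log price depends on both feature groups.
w_img_true = rng.normal(scale=0.1, size=d_img)
w_tab_true = rng.normal(scale=0.3, size=d_tab)
noise = rng.normal(scale=0.05, size=n)
log_price = 13.0 + img_emb @ w_img_true + tabular @ w_tab_true + noise

train, test = slice(0, 300), slice(300, n)


def evaluate(X):
    """Fit on the training rows and report RMSE on the held-out rows."""
    w, b = fit_ridge(X[train], log_price[train])
    return rmse(log_price[test], predict(X[test], w, b))


rmse_tabular = evaluate(tabular)
rmse_combined = evaluate(np.hstack([tabular, img_emb]))
print(f"tabular only RMSE:     {rmse_tabular:.3f}")
print(f"tabular + images RMSE: {rmse_combined:.3f}")
```

In this toy setup the combined model has the lower held-out RMSE only because the synthetic prices were generated from both feature groups; with real data, the size of any gain would depend on how much price-relevant information the image embeddings actually carry.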