A divide-and-conquer approach for spatio-temporal analysis of large house price data from Greater London

Kapil Gupta,Soudeep Deb
DOI: https://doi.org/10.48550/arxiv.2407.15905
2024-07-22
Applications
Abstract:Statistical research in real estate markets, particularly in understanding the spatio-temporal dynamics of house prices, has garnered significant attention in recent times. Although Bayesian methods are common in spatio-temporal modeling, standard Markov chain Monte Carlo (MCMC) techniques are usually slow for large datasets such as house price data. To tackle this problem, we propose a divide-and-conquer spatio-temporal modeling approach. This method involves partitioning the data into multiple subsets and applying an appropriate Gaussian process model to each subset in parallel. The results from each subset are then combined using the Wasserstein barycenter technique to obtain the global parameters for the original problem. The proposed methodology allows for multiple observations per spatial and time unit, thereby offering added benefits for practitioners. As a real-life application, we analyze house price data of more than 0.6 million transactions from 983 middle layer super output areas in London over a period of eight years. The methodology provides insightful findings about the effects of various amenities, trend patterns, and the relationship between prices and carbon emissions. Furthermore, as demonstrated through a cross-validation study, it shows good predictive accuracy while balancing computational efficiency.
What problem does this paper attempt to address?