Clustered Factor Analysis for Multivariate Spatial Data

Yanxiu Jin, Tomoya Wakayama, Renhe Jiang, Shonosuke Sugasawa
DOI: https://doi.org/10.48550/arXiv.2409.07018
2024-09-11
Abstract: Factor analysis has been extensively used to reveal the dependence structures among multivariate variables, offering valuable insight in various fields. However, it cannot incorporate the spatial heterogeneity that is typically present in spatial data. To address this issue, we introduce an effective method specifically designed to discover the potential dependence structures in multivariate spatial data. Our approach assumes that spatial locations can be approximately divided into a finite number of clusters, with locations within the same cluster sharing similar dependence structures. By leveraging an iterative algorithm that combines spatial clustering with factor analysis, we simultaneously detect spatial clusters and estimate a unique factor model for each cluster. The proposed method is evaluated through comprehensive simulation studies, demonstrating its flexibility. In addition, we apply the proposed method to a dataset of railway station attributes in the Tokyo metropolitan area, highlighting its practical applicability and effectiveness in uncovering complex spatial dependencies.
Methodology
What problem does this paper attempt to address?
This paper addresses a limitation of traditional factor analysis when it is applied to multivariate spatial data: it cannot capture spatial heterogeneity. Traditional factor analysis assumes that the relationships among variables are the same at every location, an assumption that rarely holds for geographical data. According to Tobler's First Law of Geography, everything is related to everything else, but near things are more related than distant things. A single global factor model therefore misses dependence structures that vary from place to place.

To solve this problem, the authors propose Spatially Clustered Factor Analysis (SCFA), which combines spatial clustering with factor analysis to discover the underlying dependence structures in multivariate spatial data. SCFA assumes that spatial locations can be approximately divided into a finite number of clusters, and that locations within the same cluster share a similar dependence structure. An iterative algorithm simultaneously detects the spatial clusters and estimates a separate factor model for each cluster.

The method is evaluated through simulation studies and applied to a dataset of railway station attributes in the Tokyo metropolitan area. It lets researchers examine how the dependence among variables differs across geographical locations, which is relevant to fields such as environmental science, sociology, epidemiology, and urban planning.
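The idea of alternating between spatial clustering and per-cluster factor models can be illustrated with a minimal sketch. This is not the authors' algorithm, and several choices are assumptions made for illustration only: each cluster's factor model is replaced by the closed-form probabilistic PCA fit (a factor model with isotropic noise), spatial coherence is encouraged by a hypothetical penalty weight phi that rewards agreeing with the k nearest neighbours, and clusters are initialised as Voronoi cells around randomly chosen locations. The function names fit_ppca, gaussian_loglik, and clustered_factor_sketch are hypothetical.

```python
import numpy as np

def fit_ppca(Y, q):
    """Closed-form probabilistic PCA fit: covariance = W W^T + sigma2 * I."""
    mu = Y.mean(axis=0)
    S = np.cov(Y, rowvar=False, bias=True)
    evals, evecs = np.linalg.eigh(S)
    evals, evecs = evals[::-1], evecs[:, ::-1]      # sort eigenpairs in descending order
    sigma2 = max(evals[q:].mean(), 1e-6)            # noise variance (floored for stability)
    W = evecs[:, :q] * np.sqrt(np.maximum(evals[:q] - sigma2, 0.0))
    return mu, W @ W.T + sigma2 * np.eye(Y.shape[1])

def gaussian_loglik(Y, mu, cov):
    """Row-wise log-density of Y under N(mu, cov)."""
    p = Y.shape[1]
    diff = Y - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (p * np.log(2 * np.pi) + logdet + maha)

def clustered_factor_sketch(Y, coords, G, q, n_neighbors=5, phi=1.0,
                            n_iter=20, seed=0):
    """Alternate between per-cluster model fitting and label reassignment."""
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    nbrs = np.argsort(d, axis=1)[:, 1:n_neighbors + 1]   # skip self in column 0
    # Assumed initialisation: Voronoi cells around G random seed locations.
    seeds = rng.choice(n, size=G, replace=False)
    labels = np.argmin(d[:, seeds], axis=1)
    for _ in range(n_iter):
        ll = np.full((n, G), -np.inf)
        for g in range(G):
            members = Y[labels == g]
            if len(members) > q + 1:                # skip clusters too small to fit
                mu, cov = fit_ppca(members, q)
                ll[:, g] = gaussian_loglik(Y, mu, cov)
        # Number of neighbours currently assigned to each cluster, shape (n, G).
        agree = (labels[nbrs][:, :, None] == np.arange(G)).sum(axis=1)
        new_labels = np.argmax(ll + phi * agree, axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Toy data: two spatial regions whose variables load on different factors.
rng = np.random.default_rng(1)
n, p = 200, 6
coords = rng.uniform(size=(n, 2))
true = (coords[:, 0] > 0.5).astype(int)
L = 2.0 * np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]], dtype=float)
Y = rng.normal(size=(n, 1)) * L[true] + 0.3 * rng.normal(size=(n, p))
labels = clustered_factor_sketch(Y, coords, G=2, q=1)
```

In this sketch, setting phi to zero reduces the reassignment step to a pure likelihood comparison, while larger values favour spatially contiguous clusters. The number of clusters G and the number of factors q are fixed inputs here; how they should be chosen is not covered by this illustration.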