Missing Value Imputation for Multi-view Urban Statistical Data via Spatial Correlation Learning
Yongshun Gong, Zhibin Li, Jian Zhang, Wei Liu, Yilong Yin, Yu Zheng
DOI: https://doi.org/10.1109/tkde.2021.3072642
IF: 9.235
2021-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: With ongoing urbanization, massive amounts of multi-view urban statistical data (e.g., views of Population and Economy) are increasingly collected and benefit diverse domains, including transportation services and regional analysis. Unfortunately, these statistical data, which are divided into fine-grained regions, usually suffer from missing values introduced during acquisition and storage. The problem is mainly caused by unavoidable circumstances such as document defacement, the difficulty of collecting statistics in remote districts, and inaccurate data cleaning. Missing entries hide valuable information and may distort subsequent urban analysis. To improve the quality of missing data imputation, we propose an improved spatial multi-kernel learning method that guides the imputation process in combination with an adaptive-weight non-negative matrix factorization strategy. Our model takes into account latent regional similarities, real geographical positions, and the correlations among views in order to complete missing values precisely. We conduct extensive experiments on real-world datasets to evaluate our method against other state-of-the-art approaches. The empirical results show that the proposed model outperforms the compared state-of-the-art methods. Additionally, our model shows strong generalization ability across multiple cities.
computer science, information systems, artificial intelligence, engineering, electrical & electronic
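
The abstract names non-negative matrix factorization as the backbone of the imputation step. As a rough illustration of that general idea only, the sketch below fits a plain masked NMF with multiplicative updates to a region-by-attribute matrix and fills the missing entries from the low-rank reconstruction. It is not the paper's method: the adaptive weighting, the spatial multi-kernel guidance, and the multi-view coupling are all omitted, and the function name `masked_nmf_impute` and its parameters are hypothetical.

```python
import numpy as np

def masked_nmf_impute(X, mask, rank=2, n_iter=500, eps=1e-9, seed=0):
    """Fill missing entries of a non-negative matrix X using masked NMF.

    X    : (regions x attributes) array; values where mask is False are ignored
    mask : boolean array, True where X is observed
    Hypothetical helper for illustration, not the published model.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive initial factors so multiplicative updates can move them.
    W = rng.random((n, rank)) + 0.1
    H = rng.random((rank, m)) + 0.1
    M = mask.astype(float)
    Xo = np.where(mask, X, 0.0)  # zero out missing cells (also removes NaNs)
    losses = []
    for _ in range(n_iter):
        # Multiplicative updates for min ||M * (X - W H)||_F^2 with W, H >= 0.
        W *= (Xo @ H.T) / ((M * (W @ H)) @ H.T + eps)
        H *= (W.T @ Xo) / (W.T @ (M * (W @ H)) + eps)
        # Reconstruction error measured on observed cells only.
        losses.append(float(np.sum((M * (Xo - W @ H)) ** 2)))
    # Keep observed values, use the reconstruction for missing ones.
    filled = np.where(mask, X, W @ H)
    return filled, losses

# Toy usage: a synthetic rank-2 matrix with roughly 20% of entries hidden.
rng = np.random.default_rng(1)
truth = rng.random((8, 2)) @ rng.random((2, 5))
mask = rng.random(truth.shape) > 0.2
X = np.where(mask, truth, np.nan)
filled, losses = masked_nmf_impute(X, mask)
```

Restricting the loss to observed cells is what makes this an imputation scheme rather than an ordinary factorization: the missing cells never influence the factors, and their estimates come entirely from the structure learned from the observed ones.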