Research on multiscale OpenStreetMap in China: data quality assessment with EWM-TOPSIS and GDP modeling
Chuqiao Han, Binbin Lu, Jianghua Zheng, Danlin Yu, Shudan Zheng

a College of Geography and Remote Sensing Sciences, Xinjiang University, Urumqi, China
b School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, China
c Xinjiang Key Laboratory of Oasis Ecology, Xinjiang University, Urumqi, China
d Department of Earth and Environmental Studies, Montclair State University, Montclair, NJ, USA
e Geodata division, Lantmäteriet, Luleå, Sweden

Chuqiao Han is currently a Ph.D. candidate at Xinjiang University. He focuses on research in spatiotemporal big data and geographic information system modeling.

Binbin Lu is currently an associate professor at Wuhan University. His research focuses on geocomputation, spatial statistics, geographically weighted (GW) modeling, open-source GIS, R coding, and spatio-temporal big data analysis.

Jianghua Zheng is a professor at the College of Geography and Remote Sensing Sciences at Xinjiang University. His research interests include vegetation and environmental remote sensing, monitoring and safety assessment of forest and grassland ecosystems, and applications of remote sensing and geographic information systems.

Danlin Yu is a professor at Montclair State University. His research interests are geographic information analysis, cartographical design and presentation, statistical analysis, urban and regional planning, and system dynamic modeling.

Shudan Zheng received a Master's degree from Xinjiang University and is currently employed as an engineer at Lantmäteriet, with a research focus on geographic information system development and applications.
DOI: https://doi.org/10.1080/10095020.2024.2356238
IF: 4.278
2024-06-11
Geo-spatial Information Science
Abstract: OpenStreetMap (OSM) is a volunteer-driven platform designed to provide free and up-to-date geographic data. Because OSM relies on multisource geographic data contributed by the public, the quality of the data has become a concern for researchers. In this study, a unified measure for evaluating OSM data quality was constructed, and the entropy weight method (EWM) and the Technique for Order Performance by Similarity to Ideal Solution (TOPSIS) model were used to evaluate the quality of multiscale OSM data in China from 2014 to 2020. Beyond the quality evaluation, the use of OSM data quality index factors in economic modeling at different spatial scales in China was explored with a geographic information system (GIS) analysis method and a geographically and temporally weighted regression (GTWR) model. Four machine learning models (SVM, RF, XGBoost and CatBoost) were used to simulate grid-scale GDP, and the effectiveness of these simulations was discussed. The results showed the following. (1) The weights of the OSM data quality indicator factors vary across spatial scales. (2) From 2014 to 2020, the quality of national-scale OSM data first increased, then decreased, and then gradually stabilized. In addition, OSM data quality differs significantly between the provincial and municipal scales, and its distribution is affected by population and geographical environment. (3) Over time, the spatial clustering of OSM data quality at different spatial scales in China has continuously strengthened. In addition, the quality of Chinese OSM data displays obvious local spatial autocorrelation, dominated by H-H clustering and L-L clustering. (4) The GTWR model performs well in predicting and revealing the spatiotemporal correlation between GDP and OSM data quality indicators at the provincial and municipal scales in China. The correlations increase with decreasing spatial scale (provincial to municipal). Moreover, the GDP modeling ability is better in economically underdeveloped Northwest China and economically developed East China. (5) The four machine learning models yielded the best grid-scale GDP simulation values when coupled with road network length completeness, relative linear density, road name attribute completeness, POI name attribute completeness, road network accuracy, road network update frequency, POI update frequency, topological consistency and directional similarity. Notably, the CatBoost model provided the best accuracy, and the results further verified that it is feasible to use the proposed OSM data quality index system to predict the regional economic development level in China.
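For readers unfamiliar with the EWM-TOPSIS combination named in the abstract, the following is a minimal sketch of the generic technique, not the authors' implementation. It assumes that every indicator is benefit-type (larger is better), non-negative, and varies across the evaluated regions. The function names and the indicator values are hypothetical and chosen only for illustration.

```python
import numpy as np


def entropy_weights(X):
    """Entropy weight method for an (n_regions, n_indicators) matrix.

    Assumes benefit-type indicators that each vary across regions.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    span = X.max(axis=0) - X.min(axis=0)
    if np.any(span == 0):
        raise ValueError("each indicator must vary across regions")
    # Min-max normalise each indicator to [0, 1].
    Z = (X - X.min(axis=0)) / span
    # Share of each region in each indicator column.
    P = Z / Z.sum(axis=0)
    # Shannon entropy per indicator, using the convention 0 * log(0) = 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        logP = np.where(P > 0, np.log(P), 0.0)
    e = -(P * logP).sum(axis=0) / np.log(n)
    # Lower entropy means more dispersion, hence a larger weight.
    d = 1.0 - e
    return d / d.sum()


def topsis_scores(X, w):
    """TOPSIS closeness scores in [0, 1]; higher means closer to the ideal."""
    X = np.asarray(X, dtype=float)
    # Vector-normalise each indicator, then apply the weights.
    V = X / np.sqrt((X ** 2).sum(axis=0)) * w
    ideal = V.max(axis=0)       # positive ideal solution
    anti_ideal = V.min(axis=0)  # negative ideal solution
    d_pos = np.sqrt(((V - ideal) ** 2).sum(axis=1))
    d_neg = np.sqrt(((V - anti_ideal) ** 2).sum(axis=1))
    return d_neg / (d_pos + d_neg)


# Hypothetical values: 3 regions x 3 quality indicators.
X = np.array([
    [0.9, 0.8, 0.7],
    [0.5, 0.4, 0.6],
    [0.1, 0.2, 0.1],
])
weights = entropy_weights(X)
scores = topsis_scores(X, weights)
```

Here the weights are derived from the data rather than set by expert judgment: indicators whose values are more dispersed across regions carry more information and receive larger weights, and TOPSIS then ranks each region by its relative distance to the best and worst observed indicator profiles.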
remote sensing