Abstract: The urban agglomeration on the north slope of the Tianshan Mountains is a pivotal region in Western China; it is essential for the economic growth of Xinjiang and acts as a critical bridge between China's interior and the rest of the Asia–Europe continent. Owing to unique natural conditions, the local population distribution exhibits distinct regional characteristics. This study employs the spatial lag model (SLM) from conventional spatial analysis and the random forest model (RFM) from contemporary machine learning. It integrates traditional geographic data, including land cover data and nighttime light data, with geographical big data, such as POI (points of interest) and OSM (OpenStreetMap) data, to build a comprehensive indicator database. It then simulates the spatial population distribution within the urban agglomeration in 2020. The accuracy of the results is compared with that of other available population raster datasets, and the spatial distribution pattern in 2020 is analyzed. The findings are as follows: (1) The SLM, combined with multi-source data, predicts the population distribution as a relatively uniform and nearly circular structure, with minimal spatial differentiation. (2) The RFM, combined with multi-source data, better captures the spatial population distribution, producing irregular boundaries that indicate strong spatial heterogeneity. (3) Both models simulate the population distribution with high accuracy. The accuracy of the SLM surpasses that of the GHS and GPW datasets but still trails that of WorldPop and LandScan, whereas the RFM significantly outperforms all four of these population raster datasets. (4) The spatial population pattern in the urban agglomeration on the north slope of the Tianshan Mountains consists predominantly of four distinct circles, showing a "one axis, one center, and multiple focal points" distribution. Combining the random forest model with geographic big data for population spatialization is scientifically sound and practical, and it has potential for broader application within the urban agglomeration on the north slope of the Tianshan Mountains and across Xinjiang. This study can offer insights for research on regional spatial population distributions and inform sustainable development strategies for cities and their populations.
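
The abstract does not say how model outputs are converted to population counts or which accuracy metrics are used to compare datasets. The sketch below is therefore only an illustration under stated assumptions: it treats a model's per-cell output as a weight, splits an administrative unit's census total across its grid cells in proportion to those weights (a common dasymetric step), and then ranks candidate datasets by MAE and RMSE against census counts. The function names `redistribute` and `accuracy`, the dataset labels, and all numbers are hypothetical and are not taken from the study.

```python
import math

def redistribute(unit_total, weights):
    """Split one unit's census total across its grid cells in proportion to model weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        # Assumption: with no signal at all, fall back to an even split.
        return [unit_total / len(weights)] * len(weights)
    return [unit_total * w / total_weight for w in weights]

def accuracy(estimates, census):
    """Return (MAE, RMSE) between per-unit estimates and census counts."""
    errors = [e - c for e, c in zip(estimates, census)]
    mae = sum(abs(x) for x in errors) / len(errors)
    rmse = math.sqrt(sum(x * x for x in errors) / len(errors))
    return mae, rmse

# Hypothetical toy data: one unit with three grid cells and made-up weights.
cells = redistribute(1000.0, [0.1, 0.4, 0.5])

# Hypothetical census counts for three validation units, and two candidate
# gridded datasets already aggregated to those same units.
census = [1000.0, 2500.0, 400.0]
candidates = {
    "model_a": [950.0, 2600.0, 420.0],
    "model_b": [700.0, 3100.0, 650.0],
}

scores = {name: accuracy(est, census) for name, est in candidates.items()}
best = min(scores, key=lambda name: scores[name][1])  # lowest RMSE

for name, (mae, rmse) in scores.items():
    print(f"{name}: MAE={mae:.1f}, RMSE={rmse:.1f}")
print("lowest RMSE:", best)
```

Because the redistribution preserves each unit's total, a comparison like this is only informative when it is run on units finer than, or held out from, those used to constrain the model.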

China’s Population Spatialization Based on Three Machine Learning Models

Improved Population Mapping for China Using Remotely Sensed and Points-of-interest Data Within a Random Forests Model.

Local Population Mapping Using a Random Forest Model Based on Remote and Social Sensing Data: A Case Study in Zhengzhou, China.

Improved Population Mapping for China Using the 3D Building, Nighttime Light, Points-of-Interest, and Land Use/Cover Data within a Multiscale Geographically Weighted Regression Model

Population Spatialization in Beijing City Based on Machine Learning and Multisource Remote Sensing Data

Using POI and multisource satellite datasets for mainland China's population spatialization and spatiotemporal changes based on regional heterogeneity

Population Mapping with Multisensor Remote Sensing Images and Point-Of-Interest Data.

Study on Spatialization and Spatial Pattern of Population Based on Multi-Source Data—A Case Study of the Urban Agglomeration on the North Slope of Tianshan Mountain in Xinjiang, China

A 100 m gridded population dataset of China's seventh census using ensemble learning and big geospatial data

Mapping monthly population distribution and variation at 1-km resolution across China

Improved Estimates of Population Exposure in Low-Elevation Coastal Zones of China.

Fine-scale population mapping on Tibetan Plateau using the ensemble machine learning methods and multisource data

Spatial Population Distribution Data Disaggregation Based on SDGSAT-1 Nighttime Light and Land Use Data Using Guilin, China, as an Example

Spatiotemporal dynamics of population density in China using nighttime light and geographic weighted regression method

Fine-scale population spatialization data of China in 2018 based on real location-based big data

Unraveling near real-time spatial dynamics of population using geographical ensemble learning

Optimizing Urban Population Mapping Grid Size Selection Based on Random Forest Regression

Croplayer-China: A 2-Meter Resolution Cropland Map of China Based on Active Learning of Segmentation with Mapbox and Google Satellite Imagery

A Population Spatialization Model at the Building Scale Using Random Forest