Developing an annual building volume dataset at 1-km resolution from 2001 to 2019 in China

Wenting Yan, Jianping Wu, Chaoqun Zhang, Xiuzhi Chen, Jiashun Ren, Zhenzhen Xiao, Ziyin Liao, Raffaele Lafortezza, Xueyan Li, Yongxian Su
Affiliations: (a) Guangdong Province Data Center of Terrestrial and Marine Ecosystems Carbon Cycle, School of Atmospheric Sciences, Sun Yat-sen University (Zhuhai), Zhuhai, People's Republic of China; (b) Guangdong Provincial Key Lab of Remote Sensing and Geographical Information System, Guangdong Open Laboratory of Geospatial Information Technology and Application, Guangzhou Institute of Geography, Guangdong Academy of Sciences, Guangzhou, People's Republic of China; (c) Department of Agricultural and Environmental Sciences, University of Bari 'A. Moro', Bari, Italy
DOI: https://doi.org/10.1080/17538947.2024.2330690
IF: 4.606
2024-03-25
International Journal of Digital Earth
Abstract: Urban vertical features are crucial for understanding urban morphology. However, long-term information on three-dimensional buildings, which is important fundamental data for studying historical urbanization processes, remains scarce in China. In this study, we proposed a Random Forest model to generate an annual 1-km resolution building volume dataset covering mainland China from 2001 to 2019 by integrating nighttime light data, population demographics, electricity consumption records, carbon dioxide emissions data, and various optical and statistical datasets. The new building volume data are highly consistent with those derived from Baidu Maps at the 1-km scale, with a Pearson's correlation coefficient (R) of 0.847, a root mean square error (RMSE) of 9.17 × 10⁵ m³/km² and a mean absolute error (MAE) of 5.86 × 10⁵ m³/km². Notably, cross-validation indicates that the blooming problem was greatly reduced compared with previous model-based three-dimensional building data. The proposed method holds significant advantages, benefiting from low-cost implementation based on free open-source data and providing an extendable algorithm to estimate the 3D shape of cities in the future. The time-series historical building volume data offer comprehensive insights into the historical development of urban structures and provide valuable fundamental data for future urban planning, urban climate models and land use projections.
Categories: geography, physical; remote sensing
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper aims to address the lack of long-term historical data on the 3D features of Chinese cities, particularly building volumes. Specifically, the research goal is to rapidly and automatically generate an annual building volume dataset (including both horizontal and vertical features) covering all cities in mainland China from 2001 to 2019, based on free and open-source data. The researchers combined building volume data for 75 cities in 2019 provided by Baidu Maps with 17 other environmental covariate datasets (covering 2001 to 2019) and used a random forest machine learning model to generate this dataset.

#### Main Contributions:

1. **Dataset Coverage**: Generated an annual building volume dataset with 1-kilometer resolution from 2001 to 2019.
2. **Methodological Innovation**: Employed a random forest model, integrating various environmental variable data to generate large-scale 3D urban data at a low cost.
3. **Data Reliability Verification**: Verified the accuracy of the generated dataset by comparing it with Baidu Maps data.
4. **Application Value**: Provided essential foundational data for future urban planning, urban climate models, and land use predictions.

#### Specific Implementation Steps:

1. **Data Collection and Preprocessing**: Collected building vector data from Baidu Maps and various environmental covariate data, and resampled them uniformly to a 1-kilometer resolution.
2. **Model Construction**: Used a random forest model, with Baidu Maps data as the dependent variable and environmental covariate data as the independent variables to build the model.
3. **Result Evaluation**: Assessed the accuracy of the generated dataset through cross-validation methods.
4. **Data Analysis**: Conducted spatiotemporal characteristic analysis on the generated dataset.

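The result-evaluation step relies on the three agreement metrics quoted in the abstract: Pearson's R, RMSE and MAE between estimated and reference building volume per 1-km cell. The following is a minimal sketch of how such metrics are commonly computed, not the authors' code. The function names and the sample values are hypothetical and chosen only for illustration; they are not data from the paper.

```python
import math


def pearson_r(observed, predicted):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def mae(observed, predicted):
    """Mean absolute error, in the same units as the inputs."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical building volumes (m^3 per km^2) for four grid cells:
# "reference" stands in for map-derived values, "estimated" for model output.
reference = [1.0e5, 4.0e5, 9.0e5, 2.5e6]
estimated = [2.0e5, 3.0e5, 1.1e6, 2.2e6]

print("R    =", pearson_r(reference, estimated))
print("RMSE =", rmse(reference, estimated))
print("MAE  =", mae(reference, estimated))
```

RMSE penalizes large per-cell errors more heavily than MAE, so reporting both, as the abstract does, gives a sense of whether the error is dominated by a few badly estimated cells.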
In summary, this paper fills the historical gap in 3D feature data of Chinese cities through innovative methods, providing crucial foundational data support for urban science research.