Estimating High-Resolution PM2.5 Concentration in the Sichuan Basin Using a Random Forest Model with Data-Driven Spatial Autocorrelation Terms

Yi Zhang, Siwei Zhai, Jingfei Huang, Xuelin Li, Wei Wang, Tao Zhang, Fei Yin, Yue Ma
DOI: https://doi.org/10.1016/j.jclepro.2022.134890
IF: 11.1
2022-01-01
Journal of Cleaner Production
Abstract: The Sichuan Basin (SCB) is severely polluted by fine particulate matter (PM2.5). Accurate PM2.5 concentration estimates are important for pollution control and epidemiological studies. Evidence indicates that the distribution of PM2.5 is spatially clustered. Additionally, the high local variation in PM2.5 in densely populated areas indicates the necessity of high-resolution PM2.5 estimation. However, current studies in the SCB do not consider spatial clustering and local variation, which may limit the accuracy of PM2.5 estimation. In this study, we estimated the PM2.5 concentration at 0.01° (approximately 1 km) resolution using a random forest model with data-driven spatial autocorrelation terms (DDW-RF) that considers both the first-law-of-geography-based similarity and the spatial clustering of PM2.5. Repeated 10-fold cross-validation revealed that, compared to the traditional RF model, the optimal model had an 18.31% decrease in the root mean square error (RMSE) and a 4.68% increase in the coefficient of determination (R²). The distribution of PM2.5 revealed another heavily polluted area in the northeastern SCB, including Nanchong and Dazhou, in addition to the two commonly known heavily PM2.5-polluted areas in the western and southern SCB. We then built a downscaled model for the megacity Chengdu, which estimated PM2.5 at 0.001° resolution with a 0.156 µg/m³ (1.72%) decrease in RMSE compared to the 0.01° estimates. The accurate, high-resolution PM2.5 estimates generated by the DDW-RF and downscaled models in this study could benefit accurate health effect estimation not only across the whole SCB but also in urban areas with highly variable concentrations.
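The idea of a spatial autocorrelation term can be illustrated with a minimal sketch. The abstract does not specify how the data-driven weights in DDW-RF are derived, so the code below substitutes a plain inverse-distance weighting as a stand-in; the function names, the `power` parameter, and the toy station data are all hypothetical and are not taken from the paper. It also shows how a relative RMSE decrease, of the kind reported above, is computed.

```python
import math

def spatial_lag(coords, values, power=2.0):
    """For each station, return the inverse-distance-weighted mean of the
    values observed at all OTHER stations (a simple spatial lag term).
    Assumption: inverse-distance weights stand in for the paper's
    data-driven weights, which the abstract does not describe."""
    lags = []
    for i, (xi, yi) in enumerate(coords):
        num = 0.0
        den = 0.0
        for j, (xj, yj) in enumerate(coords):
            if i == j:
                continue  # skip the station itself so its own value does not leak in
            d = max(math.hypot(xi - xj, yi - yj), 1e-12)  # guard against coincident points
            w = 1.0 / (d ** power)
            num += w * values[j]
            den += w
        lags.append(num / den)
    return lags

def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def relative_decrease(baseline, improved):
    """Percentage decrease of `improved` relative to `baseline`."""
    return 100.0 * (baseline - improved) / baseline

# Toy example (hypothetical coordinates and concentrations, not real data).
stations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
pm25 = [10.0, 20.0, 30.0]
lag_feature = spatial_lag(stations, pm25)
```

In a setup like this, `lag_feature` would be appended as an extra predictor column alongside the other covariates before fitting a random forest. Under cross-validation, the lag would need to be computed from training stations only, so that held-out observations do not inform their own predictions.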