Satellite-based Prediction of Daily SO2 Exposure Across China Using a High-Quality Random Forest-Spatiotemporal Kriging (RF-STK) Model for Health Risk Assessment

Rui Li, Lulu Cui, Ya Meng, Yilong Zhao, Hongbo Fu
DOI: https://doi.org/10.1016/j.atmosenv.2019.03.029
IF: 5
2019-01-01
Atmospheric Environment
Abstract: China has suffered from severe sulfur dioxide (SO2) pollution in recent decades. Spatiotemporal estimation and health effect assessment of SO2 using two-stage machine learning models have not yet been performed. In this study, a high-quality model coupling random forest with spatiotemporal Kriging (RF-STK) was developed to estimate daily SO2 concentrations across all of China from May 2014 to May 2015, based on satellite data and geographic covariates. Compared with other statistical methods, the RF-STK model showed better explanatory performance, with a 10-fold cross-validation R2 of 0.62 (root-mean-square error (RMSE) = 10.36 μg/m3) for daily estimates. The annual mean population-weighted SO2 concentration was estimated to be 30.49 ± 10.83 μg/m3 (mean ± standard deviation). SO2 levels displayed remarkable seasonal variation, peaking in winter (47.27 ± 22.64 μg/m3), followed by autumn (28.41 ± 10.41 μg/m3), spring (25.92 ± 7.95 μg/m3), and summer (21.33 ± 6.51 μg/m3). At the national scale, only 20.31% of the population lived in safe regions (population-weighted SO2 concentration < 20 μg/m3). Higher population-weighted SO2 concentrations were mainly concentrated in provinces of the North China Plain (NCP) (e.g., Shanxi, Hebei, Shandong), followed by the provinces of Northeast China; the lowest was in Hainan (8.31 ± 1.38 μg/m3). The mean all-cause mortality due to excessive SO2 exposure was estimated at 131,957 cases, accounting for 0.009% of the total Chinese population. Among the diseases considered, annual mortality followed the order respiratory disease (RD) (11,913 cases) > cardiovascular disease (CVD) (11,386 cases) > chronic obstructive pulmonary disease (COPD) (8,112 cases) > cerebrovascular disease (CEVD) (2,188 cases). This national-scale statistical modelling of SO2 provides valuable data for epidemiological research and air pollution prevention.
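The abstract reports population-weighted SO2 concentrations and the share of the population living below 20 μg/m3 but does not spell out the calculation. A minimal sketch follows, assuming the common definition in which each grid cell's concentration is weighted by its population. The function names and the three toy grid cells are hypothetical and are not data or code from the paper; only the 20 μg/m3 threshold comes from the abstract.

```python
from typing import Sequence


def population_weighted_mean(conc: Sequence[float], pop: Sequence[float]) -> float:
    """Return sum(conc_i * pop_i) / sum(pop_i) over grid cells."""
    if len(conc) != len(pop):
        raise ValueError("conc and pop must have the same length")
    total_pop = sum(pop)
    if total_pop <= 0:
        raise ValueError("total population must be positive")
    return sum(c * p for c, p in zip(conc, pop)) / total_pop


def share_below_threshold(
    conc: Sequence[float], pop: Sequence[float], threshold: float = 20.0
) -> float:
    """Return the fraction of the population in cells with conc < threshold."""
    if len(conc) != len(pop):
        raise ValueError("conc and pop must have the same length")
    total_pop = sum(pop)
    if total_pop <= 0:
        raise ValueError("total population must be positive")
    return sum(p for c, p in zip(conc, pop) if c < threshold) / total_pop


# Hypothetical toy grid: SO2 in ug/m3 and population per cell (not paper data).
concentrations = [10.0, 30.0, 50.0]
populations = [100.0, 300.0, 600.0]

pwm = population_weighted_mean(concentrations, populations)
safe_share = share_below_threshold(concentrations, populations)
print(f"population-weighted mean: {pwm:.2f} ug/m3")
print(f"share of population below 20 ug/m3: {safe_share:.1%}")
```

Because densely populated cells dominate the weighted mean, it can differ substantially from a simple area average when pollution and population are co-located, which is why exposure assessments of this kind usually report the weighted figure.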