Estimating Daily Ground-Level NO2 Concentrations over China Based on TROPOMI Observations and Machine Learning Approach

Shuiju Long, Xiaoli Wei, Feng Zhang, Renhe Zhang, Jian Xu, Kun Wu, Qingqing Li, Wenwen Li
DOI: https://doi.org/10.1016/j.atmosenv.2022.119310
IF: 5
2022-01-01
Atmospheric Environment
Abstract: Nitrogen dioxide (NO2) is an important target for monitoring atmospheric quality. Deriving ground-level NO2 concentrations at much finer resolution requires a high-resolution satellite tropospheric NO2 column as input and a reliable estimation algorithm. This paper aims to estimate daily ground-level NO2 concentrations over China using machine learning models and high-spatial-resolution TROPOMI NO2 data. Four tree-based machine learning models, decision trees (DT), gradient boosting decision tree (GBDT), random forest (RF) and extra-trees (ET), were used to estimate ground-level NO2 concentrations. In addition to considering many factors that influence ground-level NO2 concentrations, we introduced simplified temporal and spatial information into the estimation models. The results show that the extra-trees model with spatial and temporal information (ST-ET) performs well in estimating ground-level NO2 concentrations, with a cross-validation R² of 0.81 and an RMSE of 3.45 μg/m³ on the test datasets. The estimated results for 2019 based on the ST-ET model achieve satisfactory accuracy, with a cross-validation R² of 0.86, compared with the other models. Time-space analysis and comparison show that the estimated high-resolution results are consistent with the ground-observed NO2 concentrations. When data from January 2020 are used to test the predictive power of the models, the results indicate that the ST-ET model performs well in predicting ground-level NO2 concentrations. At four ground-level NO2 hotspots taken as examples, both the estimated ground-level NO2 concentrations and the ground-based observations during the coronavirus disease (COVID-19) pandemic were lower than in the same period of 2019. The findings offer a solid solution for accurately and efficiently estimating ground-level NO2 concentrations from satellite observations, and provide useful information for improving our understanding of the regional atmospheric environment.
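The abstract reports model skill as R² and RMSE. As a reading aid, the following minimal sketch shows how these two scores are commonly computed from paired observed and estimated concentrations. It assumes R² means the coefficient of determination (the abstract does not say whether it is that or a squared correlation), the function names are illustrative, and the sample values are hypothetical, not data from the paper.

```python
import math


def rmse(observed, estimated):
    """Root-mean-square error, in the same units as the inputs (e.g. ug/m3)."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot (an assumed definition)."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical station values, for illustration only (not from the paper).
observed = [20.0, 35.0, 50.0, 28.0]
estimated = [22.0, 33.0, 47.0, 30.0]

print(f"RMSE = {rmse(observed, estimated):.2f}")
print(f"R^2  = {r_squared(observed, estimated):.3f}")
```

In a cross-validation setting, these scores would be computed on held-out samples only, so that they reflect performance on data the model did not see during training.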