Spatiotemporally continuous PM2.5 dataset in the Mekong River Basin from 2015 to 2022 using a stacking model

Debao Chen, Xingfa Gu, Hong Guo, Tianhai Cheng, Jian Yang, Yulin Zhan, Qiming Fu
DOI: https://doi.org/10.1016/j.scitotenv.2023.169801
IF: 9.8
2024-01-11
The Science of The Total Environment
Abstract: With the potential to cause millions of deaths, PM2.5 pollution has become a global concern. In Southeast Asia, the Mekong River Basin (MRB) is experiencing heavy PM2.5 pollution, and existing PM2.5 studies in the MRB are limited in accuracy and spatiotemporal coverage. To achieve high-accuracy, long-term PM2.5 monitoring of the MRB, fused aerosol optical depth (AOD) data and multi-source auxiliary data are fed into a stacking model to estimate PM2.5 concentrations. The proposed stacking model combines the strengths of convolutional neural network (CNN) and Light Gradient Boosting Machine (LightGBM) models and represents the spatiotemporal heterogeneity of the PM2.5-AOD relationship well. In cross-validation (CV), comparison with the CNN and LightGBM models shows that the stacking model better suppresses overfitting, with a higher coefficient of determination (R²) of 0.92, a lower root mean square error (RMSE) of 5.58 μg/m³, and a lower mean absolute error (MAE) of 3.44 μg/m³. For the first time, the high-accuracy PM2.5 dataset reveals spatially and temporally continuous PM2.5 pollution and its variations in the MRB from 2015 to 2022. Moreover, the spatiotemporal variations of annual and monthly PM2.5 pollution are investigated at the regional and national scales. The dataset will contribute to the analysis of the causes of PM2.5 pollution and the development of mitigation policies in the MRB.
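The three cross-validation metrics named in the abstract (R², RMSE, MAE) can be computed from paired observed and estimated PM2.5 values. The following is a minimal sketch using the standard definitions, not the authors' evaluation code; the function name `regression_metrics` and the sample values are hypothetical and chosen only for illustration.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R^2, RMSE, MAE) for paired observed/predicted values.

    Hypothetical helper for illustration; it uses the standard textbook
    definitions and is not taken from the paper.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]

    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are equal")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Made-up PM2.5 values in μg/m³, purely illustrative (not data from the paper).
station_obs = [10.0, 20.0, 30.0, 40.0]
model_est = [12.0, 18.0, 33.0, 39.0]
r2, rmse, mae = regression_metrics(station_obs, model_est)
print(f"R2={r2:.3f}  RMSE={rmse:.3f}  MAE={mae:.3f}")
```

RMSE and MAE carry the units of the target (here μg/m³), while R² is dimensionless, which is why the abstract reports units only for the two error metrics.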
environmental sciences