A hybrid XGBoost-SMOTE model for optimization of operational air quality numerical model forecasts

Huabing Ke, Sunling Gong, Jianjun He, Lei Zhang, Jingyue Mo
DOI: https://doi.org/10.3389/fenvs.2022.1007530
IF: 5.411
2022-09-24
Frontiers in Environmental Science
Abstract: As a main technical tool, the air quality numerical model is widely used to forecast atmospheric pollutants, and its development is of great significance to the atmospheric environment and human health. In this study, a hybrid XGBoost-SMOTE model, which automatically finds the optimal hyperparameters and features without human intervention, has been developed and applied to optimize the PM2.5 and O3 concentrations forecasted by the Chinese operational air quality forecasting model, the CMA Unified Atmospheric Chemistry Environment model (CUACE). Supported by a knowledge base that includes ground-observed and CUACE-forecasted pollutant and meteorological data as well as some auxiliary variables, and based on an evaluation of 46 selected key national cities, it was found that the XGBoost-SMOTE model achieves satisfactory optimization of the operational model, notably a significant improvement in the extreme pollutant values on high-pollution days. The results show that after optimization, the 5-day average correlation coefficient (R), mean error (ME) and root mean square error (RMSE) reach 0.87, 10.34 μg/m³ and 16.53 μg/m³ for PM2.5, and 0.89, 14.53 μg/m³ and 18.83 μg/m³ for O3, far better than those from the original CUACE model and the XGBoost model. Furthermore, the optimization of the spatial distribution of pollutants from the CUACE model and an impact analysis of the input features using the SHAP method were also explored. The developed hybrid model shows good application prospects in the field of environmental meteorology forecasting.
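The three evaluation statistics quoted in the abstract (R, ME and RMSE) can be illustrated with a minimal sketch. The abstract does not define ME, so this sketch assumes it is the mean absolute difference between forecast and observation; the function name `forecast_metrics` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def forecast_metrics(observed, forecast):
    """Return (R, ME, RMSE) for paired observed and forecast concentrations.

    Assumption: ME is computed as the mean absolute error; the abstract
    does not state which definition the authors use.
    """
    n = len(observed)
    if n != len(forecast) or n < 2:
        raise ValueError("need two equally sized series with at least 2 values")

    mean_o = sum(observed) / n
    mean_f = sum(forecast) / n

    # Pearson correlation coefficient R
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(observed, forecast))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    if var_o == 0 or var_f == 0:
        raise ValueError("R is undefined for a constant series")
    r = cov / math.sqrt(var_o * var_f)

    # Mean (absolute) error and root mean square error, in the input units
    me = sum(abs(f - o) for o, f in zip(observed, forecast)) / n
    rmse = math.sqrt(sum((f - o) ** 2 for o, f in zip(observed, forecast)) / n)
    return r, me, rmse


# Hypothetical PM2.5 values in μg/m³, for illustration only
obs = [10.0, 20.0, 30.0, 40.0]
fcst = [12.0, 18.0, 33.0, 41.0]
r, me, rmse = forecast_metrics(obs, fcst)
```

A higher R together with lower ME and RMSE indicates a forecast that tracks observations more closely, which is the sense in which the abstract compares the optimized output with the original CUACE forecast.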
environmental sciences