Hourly PM2.5 concentration forecasting based on mode decomposition-recombination technique and ensemble learning approach in severe haze episodes of China

Wei Sun, Zhaoqi Li
DOI: https://doi.org/10.1016/j.jclepro.2020.121442
IF: 11.1
2020-08-01
Journal of Cleaner Production
Abstract:<p>Accurate prediction of PM<sub>2.5</sub> concentration for monitoring and early warning has become a major concern given the frequent occurrence of severe haze episodes in China. In this paper, a novel model for hourly PM<sub>2.5</sub> concentration prediction during severe haze episodes is proposed, in which a mode decomposition-recombination technique and an ensemble learning approach are introduced into high-concentration PM<sub>2.5</sub> forecasting. First, fast ensemble empirical mode decomposition (FEEMD) is applied to decompose the original complex, non-stationary data into several modes with different frequencies, offsetting the impact of data noise. Sample entropy (SE) is then used to recombine similar modes, which avoids over-decomposition and promotes accurate information extraction and computational efficiency. Next, partial autocorrelation analysis (PACF) is performed on the recombined modes to select appropriate input features for the forecasting model. Finally, a Stacking-driven ensemble model (SDEM) is developed as the forecasting model to enhance feature representation and information utilization, with K-fold cross-validation carried out in each base model to improve generalization performance. The outputs of the base models serve as new inputs to the meta-model, which produces the final forecasting values. 
Empirical results for Baoding during autumn and winter show that: (1) the proposed model is effective and robust in PM<sub>2.5</sub> concentration prediction, and these advantages are not weakened even at extremely high PM<sub>2.5</sub> concentrations; (2) the decomposition-recombination technique has the potential to handle nonlinear and highly volatile data; (3) the ensemble learning approach can inherit and integrate the merits of single models; (4) the developed FEEMD-SE-SDEM model outperforms the other contrast models in terms of forecasting accuracy, stability and class prediction correct rate, making it promising for early air quality warning systems.</p>
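The recombination step described in the abstract can be illustrated with a minimal sketch. It assumes the modes have already been produced by some decomposition (FEEMD itself is not implemented here) and computes sample entropy in its common form, with embedding dimension `m` and tolerance `r_factor` times the standard deviation. The function names, the default parameters, and the rule of merging modes whose SE values lie within `tol` of each other are illustrative assumptions, not the authors' published procedure.

```python
import math
import random


def sample_entropy(series, m=2, r_factor=0.2):
    """Sample entropy of a 1-D sequence (common Richman-Moorman form).

    m and r_factor are assumed defaults, not values taken from the paper.
    """
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    r = r_factor * std  # tolerance scaled by the series' spread

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between two templates
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # no matches: entropy is undefined, treated as maximal
    return -math.log(a / b)


def recombine_modes(modes, tol=0.5):
    """Sum modes with similar sample entropy (hypothetical grouping rule).

    Modes are sorted by SE; a mode joins the current group when its SE is
    within tol of the group's first member, otherwise it starts a new group.
    """
    scored = sorted(((sample_entropy(mode), mode) for mode in modes),
                    key=lambda pair: pair[0])
    groups = []  # each entry: [SE of first member, summed mode]
    for se, mode in scored:
        if groups and se - groups[-1][0] <= tol:
            groups[-1][1] = [x + y for x, y in zip(groups[-1][1], mode)]
        else:
            groups.append([se, list(mode)])
    return [summed for _, summed in groups]


# Toy stand-ins for decomposed modes: two regular components and one noisy one.
random.seed(0)
smooth = [math.sin(2 * math.pi * i / 25) for i in range(200)]
noisy = [random.gauss(0.0, 1.0) for _ in range(200)]
recombined = recombine_modes([smooth, smooth, noisy])
```

In this toy setup the two regular components have identical entropy and are summed into one series, while the noisy component stays separate, so fewer series need to be modelled downstream.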
environmental sciences,green & sustainable science & technology,engineering, environmental