A New Hybrid Machine Learning Model for Short-Term Climate Prediction by Performing Classification Prediction and Regression Prediction Simultaneously

Li Deqian, Hu Shujuan, Guo Jinyuan, Wang Kai, Gao Chenbin, Wang Siyi, He Wenping
DOI: https://doi.org/10.1007/s13351-022-1214-3
2022-01-01
Abstract: Machine learning methods are effective tools for improving short-term climate prediction. However, commonly used methods often carry out classification and regression prediction modeling separately and independently. Such a single modeling approach may yield inconsistent classification and regression results and thus may not meet the needs of practical applications well. To address this issue, this study proposes a selective Naive Bayes ensemble model (SENB-EM) by introducing causal effect and a voting strategy into Naive Bayes. The new model can not only screen effective predictors but also perform classification and regression prediction simultaneously. When applied to the area prediction of the summer western North Pacific subtropical high (WNPSH) from 2008 to 2021, the accuracy classification score (a metric to assess the overall classification prediction accuracy) and the time correlation coefficient (TCC) of SENB-EM reach 1.0 and 0.81, respectively. After integrating the results of different models [including the multiple linear regression ensemble model (MLR-EM), SENB-EM, and the Chinese Multimodel Ensemble Prediction System (CMME) used by the National Climate Center (NCC)] for 2017–2021, the TCC of the ensemble results of SENB-EM and CMME reaches 0.92 (the highest result among them). This indicates that the prediction results of the summer WNPSH area provided by SENB-EM have a high reference value for real-time prediction. It is worth noting that, in addition to the numerical prediction results, the SENB-EM model can also give the range of numerical prediction intervals and predictions for anomalous degrees of the WNPSH area, thus providing more reference information for meteorological forecasters. Overall, as a new hybrid machine learning model, the SENB-EM has good prediction ability; the approach of performing classification prediction and regression prediction simultaneously through integration is informative for short-term climate prediction.
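The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes that the accuracy classification score is the fraction of years whose predicted class matches the observed class, and that the TCC is the Pearson correlation between the observed and predicted time series; the abstract does not give the exact formulas, so both definitions are assumptions. The function names and the sample values are hypothetical and are not taken from the paper.

```python
from math import sqrt


def accuracy_score(observed_classes, predicted_classes):
    """Fraction of time steps whose predicted class equals the observed class."""
    if not observed_classes or len(observed_classes) != len(predicted_classes):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(o == p for o, p in zip(observed_classes, predicted_classes))
    return hits / len(observed_classes)


def time_correlation_coefficient(observed, predicted):
    """Pearson correlation between observed and predicted series (assumed TCC)."""
    if len(observed) < 2 or len(observed) != len(predicted):
        raise ValueError("inputs must have equal length of at least 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_o * var_p)


if __name__ == "__main__":
    # Toy values for illustration only; not data from the paper.
    obs_cls = ["high", "low", "normal", "high"]
    pred_cls = ["high", "low", "low", "high"]
    obs_val = [1.2, -0.8, 0.1, 0.9]
    pred_val = [1.0, -0.5, -0.2, 0.7]
    print("accuracy:", accuracy_score(obs_cls, pred_cls))
    print("TCC:", time_correlation_coefficient(obs_val, pred_val))
```

Under these assumed definitions, an accuracy score of 1.0 would mean every year in the evaluation period was assigned the correct class, while the TCC measures how closely the predicted values track the observed year-to-year variation.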