PSO-Stacking Improved Ensemble Model for Campus Building Energy Consumption Forecasting Based on Priority Feature Selection

Yisheng Cao, Gang Liu, Jian Sun, Durga Prasad Bavirisetti, Gang Xiao
DOI: https://doi.org/10.1016/j.jobe.2023.106589
IF: 7.144
2023-01-01
Journal of Building Engineering
Abstract: Building energy consumption forecasting plays an indispensable role in energy resource management and scheduling. When using an ensemble forecasting model, it is difficult to determine the optimal combination of parameters for integrating the algorithms. To address this problem, a Particle Swarm Optimization-Stacking Improved Ensemble (PStIE) model is proposed to improve the Stacking ensemble model. With a regressor pool of 11 Machine Learning (ML) algorithms, the Particle Swarm Optimization (PSO) algorithm is used to find the optimal combination of base models and meta-model in Stacking. In addition, a Priority Feature Selection (PFS) method is proposed. Unlike previous approaches that rely on a single feature selection algorithm, PFS integrates the feature rankings of three feature selection algorithms, calculates a priority coefficient for each feature, and selects the features with the smallest priority coefficients as the candidate feature set. Furthermore, once the number of training features of a traditional Stacking model reaches “saturation”, adding more features increases training time without much improving forecasting accuracy. To address this, the PFS method is used to perform feature fusion in the second layer of the PSO-Stacking framework. To evaluate the proposed framework, experiments are conducted on a dataset of hourly electricity consumption for a campus building located in Cambridge, Massachusetts, USA. The experimental results show that the RMSE of the PSO-Stacking framework is 1.71 lower than that of commonly used ML algorithms. In the ablation study, for every tested number of selected features, the PFS method chooses the best or second-best feature combination. When the features selected by the PFS method are used for subsequent feature fusion, the RMSE of the PStIE model is 2.62 lower than that without feature fusion.
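
The rank-aggregation idea behind PFS can be illustrated with a minimal sketch. The abstract does not give the formula for the priority coefficient, so the code below assumes it is the sum of a feature's ranks across the three selection algorithms (rank 1 = most important). The function names, the score values, and the choice of k are all hypothetical, and this may differ from the authors' actual method.

```python
import numpy as np

def ranks_from_scores(scores):
    """Convert importance scores to ranks, where rank 1 is the most important feature."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")  # feature indices, highest score first
    ranks = np.empty(len(scores), dtype=int)
    ranks[order] = np.arange(1, len(scores) + 1)
    return ranks

def priority_feature_selection(score_lists, k):
    """Pick the k features with the smallest priority coefficient.

    ASSUMPTION: the priority coefficient is the sum of a feature's ranks
    over all selection algorithms; the abstract does not state the formula.
    """
    rank_matrix = np.vstack([ranks_from_scores(s) for s in score_lists])
    priority = rank_matrix.sum(axis=0)
    selected = np.argsort(priority, kind="stable")[:k]
    return selected, priority

# Hypothetical importance scores for 5 features from three selection algorithms.
scores_a = [0.9, 0.1, 0.5, 0.3, 0.7]
scores_b = [0.8, 0.2, 0.6, 0.1, 0.4]
scores_c = [0.3, 0.05, 0.4, 0.1, 0.15]

selected, priority = priority_feature_selection([scores_a, scores_b, scores_c], k=3)
print("priority coefficients:", priority.tolist())
print("selected feature indices:", selected.tolist())
```

Summing ranks rather than raw scores keeps the three algorithms on a common scale, which is one plausible reason to aggregate rankings instead of importance values.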