A clustering-based feature enhancement method for short-term natural gas consumption forecasting
Jinyuan Liu, Shouxi Wang, Nan Wei, Weibiao Qiao, Ze Li, Fanhua Zeng
DOI: https://doi.org/10.1016/j.energy.2023.128022
IF: 9
2023-09-01
Energy
Abstract: Natural gas consumption forecasting is crucial for the planning and operation of sustainable energy systems. The accuracy of consumption forecasting is significantly affected by the quality of the collected features. Previous feature clustering methods, such as K-means and the Gaussian mixture model (GMM), ignore how weakly correlated factors interfere with clustering and thus fail to extract key information from the collected features. This paper proposes a novel feature enhancement method, Gaussian correlation mixed clustering (GCMC), which extracts fluctuation patterns from the most highly correlated factors and divides the original sequence into multiple clusters, enhancing feature quality while reducing the complexity of the fluctuations. Within GCMC, correlation coefficient analysis, GMM, the Bayesian information criterion and an improved information evaluation method are combined to cluster the selected highest-correlation feature by fluctuation pattern and to evaluate the resulting improvement in feature quality. Each cluster is then treated as an independent dataset for a long short-term memory (LSTM) model, the clusters are forecast in parallel, and the results are restored to the order of the original sequence. In our experiments, we design four real-life datasets of differing complexity. The results show that the proposed method outperforms GMM in terms of both information entropy and accuracy. The information entropy used to evaluate feature quality is improved by 6.13–9.66%. The mean absolute range normalized errors (MARNE) of GCMC-LSTM for Karditsa, Thessaloniki, Oinofyta and Salfa Anthoussa are 6.06%, 4.62%, 14.18% and 15.30%, respectively, the best performance and robustness among the classic forecasting models compared. The gain is largest for datasets with high complexity: introducing GCMC improves the MARNE for Oinofyta by 32.72%.
energy & fuels, thermodynamics
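
The workflow described in the abstract (pick the most correlated feature, cluster it with a GMM whose size is chosen by BIC, forecast each cluster separately, restore the original order) can be illustrated with a minimal sketch. This is not the authors' GCMC implementation. The data are synthetic, the feature names `temperature` and `wind` are hypothetical, the 1-D EM routine is a plain textbook GMM, a per-cluster mean stands in for the LSTM, the improved information evaluation step is omitted, and the MARNE formula (mean absolute error divided by the maximum observed value) is an assumption about the metric's definition.

```python
import math
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def normal_pdf(v, mu, var):
    return math.exp(-(v - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(x, k, iters=60):
    """Plain EM for a 1-D Gaussian mixture; returns (weights, means, variances, log-likelihood)."""
    n, xs, gm = len(x), sorted(x), mean(x)
    mu = [xs[int((j + 0.5) * n / k)] for j in range(k)]  # deterministic quantile init
    var = [max(sum((v - gm) ** 2 for v in x) / n, 1e-6)] * k
    w = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for v in x:
            p = [w[j] * normal_pdf(v, mu[j], var[j]) for j in range(k)]
            s = sum(p) or 1e-300
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weights, means and variances
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            w[j] = nj / n
            mu[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            var[j] = max(sum(r[j] * (v - mu[j]) ** 2 for r, v in zip(resp, x)) / nj, 1e-6)
    loglik = sum(math.log(sum(w[j] * normal_pdf(v, mu[j], var[j]) for j in range(k)) or 1e-300)
                 for v in x)
    return w, mu, var, loglik

def bic(loglik, k, n):
    """BIC for a 1-D mixture with k components: 3k - 1 free parameters."""
    return (3 * k - 1) * math.log(n) - 2 * loglik

def marne(actual, forecast):
    """Assumed definition: mean absolute error over max(actual), in percent."""
    return 100 * mean(abs(a - f) for a, f in zip(actual, forecast)) / max(actual)

# Synthetic data: two temperature regimes drive consumption; wind is unrelated noise.
rng = random.Random(0)
temp = [rng.gauss(2, 1.5) for _ in range(100)] + [rng.gauss(20, 1.5) for _ in range(100)]
rng.shuffle(temp)
wind = [rng.gauss(5, 2) for _ in temp]
load = [100 - 3 * t + rng.gauss(0, 2) for t in temp]
features = {"temperature": temp, "wind": wind}

# 1. Keep only the feature most strongly correlated with consumption.
best_feature = max(features, key=lambda name: abs(pearson(features[name], load)))
x = features[best_feature]

# 2. Fit mixtures of several sizes and keep the one with the lowest BIC.
fits = {k: fit_gmm_1d(x, k) for k in range(1, 5)}
best_k = min(fits, key=lambda k: bic(fits[k][3], k, len(x)))
w, mu, var, _ = fits[best_k]
labels = [max(range(best_k), key=lambda j: w[j] * normal_pdf(v, mu[j], var[j])) for v in x]

# 3. Forecast each cluster independently, then write results back by original index.
restored = [None] * len(load)
for c in set(labels):
    idx = [i for i, lab in enumerate(labels) if lab == c]
    cluster_forecast = mean(load[i] for i in idx)  # placeholder for a per-cluster LSTM
    for i in idx:
        restored[i] = cluster_forecast

baseline = [mean(load)] * len(load)  # same placeholder model without clustering
print(best_feature, best_k, marne(load, restored), marne(load, baseline))
```

Writing each cluster's output back through the saved indices is what keeps the restored sequence aligned with the original time order, whichever forecaster is used for the individual clusters.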