Effective Voting-Based Ensemble Learning for Segregated Load Forecasting With Low Sampling Data

Shahzeb Ahmad Khan, Attique Ur Rehman, Ammar Arshad, Mohammed H. Alqahtani, Karar Mahmoud, Matti Lehtonen
DOI: https://doi.org/10.1109/access.2024.3413679
IF: 3.9
2024-06-21
IEEE Access
Abstract: In power system planning and operation, load forecasting is an important task because it helps ensure a reliable and efficient electricity supply. For effective smart grid operation, load forecasting also supports balanced power dispatch, load management, and load shifting. This paper proposes an accurate load forecasting approach that implements and integrates standalone and ensemble machine learning models, particularly for segregated real-world load data. The machine learning models used in this study are k-nearest neighbors, random forest, decision tree, and voting ensemble regression. The time series load data was acquired from a real-world load database, Pecan Street Dataport. For performance evaluation, two statistical error metrics are used: mean absolute error (MAE) and mean squared error (MSE). The simulations were carried out in Python with several machine learning libraries, and NumPy, pandas, and matplotlib were used for numerical data analysis and visualization. The empirical study presents a comparative performance analysis of machine learning models for load forecasting with low sampling load data, at both the aggregated and the segregated level. Both standalone and ensemble machine learning algorithms yield very good forecasting results, and models trained on segregated data exhibit superior performance compared to those trained on aggregated data. On segregated data, the proposed voting-based ensemble machine learning algorithm outperforms all the other models with an MAE of 0.05708, followed by k-nearest neighbors (MAE 0.05879), random forest (MAE 0.07069), and decision tree (MAE 0.07361).
computer science, information systems; telecommunications; engineering, electrical & electronic
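
The voting ensemble and the two error metrics named in the abstract can be illustrated with a minimal sketch. It assumes the ensemble forecast is the unweighted average of the base models' forecasts, which is a common form of voting regression; the abstract does not state the weighting the authors used. The function names and the toy load values below are invented for illustration and are not taken from the paper or from Pecan Street Dataport.

```python
from statistics import fmean


def mean_absolute_error(y_true, y_pred):
    """MAE: average of |actual - forecast| over all time steps."""
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def mean_squared_error(y_true, y_pred):
    """MSE: average of (actual - forecast)^2 over all time steps."""
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def voting_average(*model_predictions):
    """Unweighted voting: average the base forecasts at each time step."""
    return [fmean(step) for step in zip(*model_predictions)]


# Toy values (assumed units: kW), invented purely for illustration.
y_true = [1.0, 2.0, 3.0, 4.0]
knn_pred = [1.5, 2.0, 2.5, 4.0]  # stand-in for a k-nearest neighbors forecast
rf_pred = [1.0, 2.5, 3.0, 3.5]   # stand-in for a random forest forecast
dt_pred = [0.5, 2.0, 3.5, 4.0]   # stand-in for a decision tree forecast

ensemble_pred = voting_average(knn_pred, rf_pred, dt_pred)

for name, pred in [
    ("k-nearest neighbors", knn_pred),
    ("random forest", rf_pred),
    ("decision tree", dt_pred),
    ("voting ensemble", ensemble_pred),
]:
    mae = mean_absolute_error(y_true, pred)
    mse = mean_squared_error(y_true, pred)
    print(f"{name}: MAE={mae:.4f}, MSE={mse:.4f}")
```

In this toy case the base models' errors partly cancel, so the averaged forecast has a lower MAE than any single model. Averaging does not guarantee this in general; it helps most when the base models make errors in different directions.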