Clustering-based Multitasking Deep Neural Network for Solar Photovoltaics Power Generation Prediction

Hui Song, Zheng Miao, Ali Babalhavaeji, Saman Mehrnia, Mahdi Jalili, Xinghuo Yu
2024-05-14
Abstract: The increasing installation of photovoltaic (PV) cells leads to more generation from renewable energy sources (RES), but also increases the uncertainty of energy scheduling. Predicting PV power generation is important for energy management and dispatch optimization in smart grids. However, PV power generation data is often collected across different types of customers (e.g., residential, agricultural, industrial, and commercial) while the customer information is always de-identified. This often results in a forecasting model trained with all PV power generation data, allowing the predictor to learn various patterns through intra-model self-learning, instead of constructing a separate predictor for each customer type. In this paper, we propose a clustering-based multitasking deep neural network (CM-DNN) framework for PV power generation prediction. K-means is applied to cluster the data into different customer types. For each type, a deep neural network (DNN) is employed and trained until the accuracy cannot be improved. Subsequently, for a specified customer type (i.e., the target task), inter-model knowledge transfer is conducted to enhance its training accuracy. During this process, source task selection is designed to choose the optimal subset of tasks (excluding the target customer), and each selected source task uses a coefficient to determine the amount of DNN model knowledge (weights and biases) transferred to the target prediction task. The proposed CM-DNN is tested on a real-world PV power generation dataset, and its superiority is demonstrated by comparing its prediction performance with that of a single model trained on the dataset without clustering.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the uncertainty in photovoltaic (PV) power generation prediction. As more PV cells are installed, renewable generation grows, but so does the uncertainty in energy scheduling. Accurate prediction of PV power generation is crucial for energy management and dispatch optimization in smart grids. However, PV power data are usually collected from different types of customers (such as residential, agricultural, industrial, and commercial), and the customer information is de-identified. As a result, a single prediction model is typically trained on all PV power data instead of a separate predictor being built for each customer type. To solve this problem, the authors propose a clustering-based multitasking deep neural network (CM-DNN) framework for PV power generation prediction. Specifically:

1. **Data Clustering**: Use the K-means algorithm to cluster the data into different customer types.
2. **Deep Neural Network Training**: Train a deep neural network (DNN) for each customer type until the prediction accuracy can no longer be improved.
3. **Cross-model Knowledge Transfer**: For a specified customer type (the target task), transfer knowledge from the other trained models to enhance the training accuracy of the target task. The method selects the optimal subset of source tasks (excluding the target) and assigns a coefficient to each selected source task to determine the amount of DNN model knowledge (weights and biases) to be transferred.
4. **Optimization Algorithm**: Use the particle swarm optimization (PSO) algorithm to find the optimal subset of source tasks and the associated coefficients.

Through this method, the CM-DNN framework aims to improve the accuracy of PV power generation prediction for each customer type.
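The clustering and transfer steps described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The function names `kmeans` and `transfer_parameters` are hypothetical, the centroid seeding is an illustrative choice, and the additive blending rule is an assumption, since the summary does not state how the coefficients are applied to the weights and biases. The PSO search is omitted; the selected sources and their coefficients are simply passed in.

```python
from typing import Dict, List, Sequence


def kmeans(points: Sequence[Sequence[float]], k: int, iters: int = 20) -> List[int]:
    """Plain Lloyd's k-means; returns one cluster label per point.

    Centroids are seeded with the first k points (illustrative choice only).
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def transfer_parameters(
    target: List[float],
    sources: Dict[str, List[float]],
    coeffs: Dict[str, float],
) -> List[float]:
    """Blend selected source-task parameters into the target task's parameters.

    ASSUMPTION: the update is target + sum_s coeff_s * source_s. The source
    text only says a coefficient controls how much knowledge is transferred.
    Tasks absent from `coeffs` are treated as not selected.
    """
    blended = list(target)
    for name, coeff in coeffs.items():
        for i, value in enumerate(sources[name]):
            blended[i] += coeff * value
    return blended


# Toy usage: four 2-D "load profiles" that fall into two obvious groups.
profiles = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9]]
labels = kmeans(profiles, k=2)

# Flattened parameters of per-cluster models (made-up numbers).
target_params = [1.0, 2.0]
source_params = {"a": [1.0, 0.0], "b": [0.0, 4.0], "c": [9.0, 9.0]}
new_params = transfer_parameters(target_params, source_params, {"a": 0.5, "b": 0.25})
```

In this sketch, source task `c` is left out of the coefficient dictionary, which stands in for the source task selection step. In the framework as summarized, both the subset and the coefficients would come from the PSO search rather than being fixed by hand.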
The paper verifies the effectiveness of CM-DNN by comparing it with several popular time-series prediction models (such as RNN, 1DCNN, LSTM, and GRU). The experimental results show that the prediction performance of CM-DNN on the residential, agricultural, industrial, and commercial datasets is better than that of the other models.