Submodular Load Clustering with Robust Principal Component Analysis

Yishen Wang, Xiao Lu, Yiran Xu, Di Shi, Zhehan Yi, Jiajun Duan, Zhiwei Wang
DOI: https://doi.org/10.48550/arXiv.1902.07376
2019-02-20
Abstract: Traditional load analysis is facing challenges from new electricity usage patterns driven by demand response and the increasing deployment of distributed generation, including photovoltaics (PV), electric vehicles (EV), and energy storage systems (ESS). At the transmission level, despite irregular load behaviors in different areas, highly aggregated load shapes still share similar characteristics. Load clustering aims to discover such intrinsic patterns and provide useful information to other load applications, such as load forecasting and load modeling. This paper proposes an efficient submodular load clustering method for transmission-level load areas. Robust principal component analysis (R-PCA) first decomposes the annual load profiles into low-rank and sparse components to extract key features. A novel submodular cluster center selection technique is then applied to determine the optimal cluster centers through a constructed similarity graph. Following the selection results, load areas are efficiently assigned to different clusters for further load analysis and applications. Numerical results obtained from PJM load data demonstrate the effectiveness of the proposed approach.
Machine Learning
What problem does this paper attempt to address?
The paper addresses the following problem: as distributed energy resources (photovoltaic generation, electric vehicles, and energy storage systems) spread and demand-response mechanisms are introduced, load analysis in power systems faces new challenges. Traditional load analysis methods struggle to cope with the irregular load behaviors these changes bring. At the transmission level, however, highly aggregated load curves still share similar characteristics even though individual load areas behave irregularly. The paper therefore proposes an efficient submodular load clustering method that discovers these inherent patterns and provides useful information to other load applications, such as load forecasting and load modeling. The main steps of the method are:
1. **Data Normalization**: The historical annual load curve of each load area is represented as a column vector \(x_i\in\mathbb{R}^{N_T}\) and normalized for computational stability and ease of comparison:
\[ y_i=\frac{x_i - X_{\min,i}}{X_{\max,i}-X_{\min,i}} \]
where \(X_{\max,i}\) and \(X_{\min,i}\) are the maximum and minimum load values of area \(i\), respectively.
2. **Robust Principal Component Analysis (R-PCA)**: R-PCA decomposes the normalized load data \(\mathbf{M}\) into a low-rank component \(\mathbf{L}\) and a sparse component \(\mathbf{S}\) to extract key features and to mitigate data quality problems such as corrupted or missing data. The optimization problem is:
\[ \min\|\mathbf{L}\|_*+\mu\|\mathbf{S}\|_1\quad\text{s.t.}\quad\mathbf{L}+\mathbf{S}=\mathbf{M} \]
where \(\|\mathbf{L}\|_*\) is the nuclear norm of \(\mathbf{L}\), \(\|\mathbf{S}\|_1\) is the \(\ell_1\)-norm of \(\mathbf{S}\), and the weight factor \(\mu\) is given by:
\[ \mu = \frac{1}{\sqrt{\max(N_T, N_I)}} \]
Here \(N_T\) is the length of each load profile and \(N_I\) is presumably the number of load areas.
3. **Feature Extraction**: Seasonal average load, seasonal standard deviation, seasonal maximum load, and seasonal minimum load are extracted from the decomposed low-rank and sparse components. The feature vector for each area has length \(16\).
4. **Similarity Graph Construction**: A similarity graph is built, with the similarity between each pair of load areas computed by a radial basis function (RBF):
\[ w_{ij}=e^{-\frac{\|z_i - z_j\|^2}{\lambda}} \]
where \(z_i\) and \(z_j\) are the feature vectors of areas \(i\) and \(j\), respectively, and the parameter \(\lambda\) controls the similarity scaling.
5. **Submodular Cluster Center Selection**: A novel submodular optimization technique determines the cluster centers. Centers are selected one at a time from the constructed similarity graph, which avoids repeated clustering runs. The algorithm yields deterministic, stable results.
6. **Load Cluster Assignment**: Given the selected cluster centers, each remaining load area is assigned to its most similar center to form the final clusters.
With this method, the paper aims to improve the efficiency and accuracy of clustering transmission-level load areas and to support subsequent load analysis and applications. Experimental results show that the method outperforms the traditional K-Means algorithm on PJM load data.
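Steps 1 and 4 to 6 above can be sketched in Python with NumPy. This is only an illustrative sketch, not the authors' implementation. The summary does not state which submodular objective the paper uses, so the code assumes a facility-location objective, \(f(C)=\sum_i \max_{c\in C} w_{ic}\), maximized greedily. The function names (`normalize_columns`, `rbf_similarity`, `greedy_centers`, `assign`), the toy feature matrix `Z`, and the value of `lam` are all hypothetical. R-PCA and seasonal feature extraction are omitted; the feature vectors are taken as given.

```python
import numpy as np

def normalize_columns(X):
    """Min-max scale each column (one load area per column) to [0, 1]."""
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    return (X - x_min) / (x_max - x_min)

def rbf_similarity(Z, lam):
    """w_ij = exp(-||z_i - z_j||^2 / lam), one feature vector z_i per row of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.exp(-sq_dist / lam)

def greedy_centers(W, k):
    """Pick k centers greedily for the ASSUMED facility-location objective
    f(C) = sum_i max_{c in C} w_ic (an assumption, not taken from the paper)."""
    n = W.shape[0]
    best = np.zeros(n)                 # best similarity of each area to a chosen center
    chosen = np.zeros(n, dtype=bool)   # marks areas already selected as centers
    centers = []
    for _ in range(k):
        # Marginal gain of adding each candidate j as a new center.
        gains = np.maximum(W, best[:, None]).sum(axis=0) - best.sum()
        gains[chosen] = -np.inf        # never pick the same center twice
        j = int(np.argmax(gains))
        centers.append(j)
        chosen[j] = True
        best = np.maximum(best, W[:, j])
    return centers

def assign(W, centers):
    """Label each area with the index of its most similar center."""
    return np.asarray(centers)[np.argmax(W[:, centers], axis=1)]

# Toy example: six "areas" with 2-D features forming two well-separated groups.
Z = np.array([[0., 0.], [1., 0.], [0., 1.],
              [10., 10.], [11., 10.], [10., 11.]])
W = rbf_similarity(Z, lam=10.0)
centers = greedy_centers(W, k=2)
labels = assign(W, centers)
```

Because the greedy selection involves no random initialization, repeated runs on the same similarity matrix return the same centers, which is consistent with the deterministic behavior described in step 5.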