Research on parallel clustering of power load based on improved K-Means algorithm

Yuanbin XU, Guohui LI, Kun GUO, Songrong GUO, Wei LIN
DOI: https://doi.org/10.3778/j.issn.1002-8331.1603-0110
2017-01-01
Abstract: Electric power enterprises commonly classify customers by applying the traditional K-Means algorithm to power load data, but the main drawback of this approach is that the user must specify the number of clusters manually. This paper proposes a load clustering method that combines the Canopy algorithm with the K-Means algorithm and divides customers into groups automatically, without a manually specified number of clusters. First, users' electricity data are collected and the raw data are preprocessed with the parallel computing framework MapReduce. Then, Canopy and K-Means are used to build an automatic load clustering model. Finally, an empirical analysis on real consumption data, evaluated with the Silhouette index, shows that the proposed method is more stable and convenient and has wider applicability.
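The Canopy plus K-Means combination described in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' implementation: the MapReduce parallelization and preprocessing are omitted, the function names (`canopy_centers`, `kmeans`), the distance thresholds `t1` and `t2`, and the toy two-dimensional "load profiles" are all assumptions made for illustration. The idea shown is that a Canopy pass yields both the number of clusters and the initial centroids, which a standard K-Means loop then refines.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopy_centers(points, t1, t2):
    """Pick canopy centers with a loose threshold t1 and a tight threshold t2.

    Assumed textbook form of Canopy: t1 > t2. Points within t2 of a center
    are removed from the candidate list; t1 only bounds canopy membership,
    which this sketch does not need to return.
    """
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    centers = []
    while remaining:
        center = remaining.pop(0)
        centers.append(center)
        # Drop every point that is tightly bound to this center.
        remaining = [p for p in remaining if euclidean(center, p) >= t2]
    return centers


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: euclidean(p, centers[j]))
        for p in points
    ]


def kmeans(points, initial_centers, max_iter=100):
    """Plain Lloyd iteration seeded with the given initial centers."""
    centers = [tuple(c) for c in initial_centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(old)  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return assign(points, centers), centers


# Hypothetical toy data: two obvious groups of 2-D "load profiles".
profiles = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
            (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
initial = canopy_centers(profiles, t1=3.0, t2=1.5)
labels, centers = kmeans(profiles, initial)
print(len(initial), labels)
```

Note that the thresholds `t1` and `t2` still have to be chosen in this sketch; what the Canopy step removes is the need to fix the number of clusters in advance.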