Adaptive Density Peaks Clustering Based on K-Nearest Neighbor and Gini Coefficient

Zehua Wang, Rui Sun, Wenke Zang, Dong Jiang, Xiyu Liu
DOI: https://doi.org/10.1109/ACCESS.2020.3003057
IF: 3.9
IEEE Access
Abstract: Density Peaks Clustering (DPC) is a density-based clustering algorithm that has the advantages of not requiring clustering parameters and of detecting non-spherical clusters. The density peaks algorithm obtains the actual cluster centers by taking the cutoff distance as input and having the cluster centers selected manually. Thus, the cluster centers are not selected with the whole data set taken into account. This paper proposes a method called G-KNN-DPC that calculates the cutoff distance based on the Gini coefficient and K-nearest neighbors. G-KNN-DPC first finds the optimal cutoff distance with the Gini coefficient and then finds the center points with the K-nearest neighbors. This automatic center selection not only avoids the error of detecting two center points in a single cluster but also effectively addresses the traditional DPC algorithm's inability to handle complex data sets. Compared with DPC, Fuzzy C-Means, K-means, KDPC and DBSCAN, the proposed algorithm produces better clusters on different data sets.
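The cutoff-selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: the Gaussian-kernel local density, the decision value gamma = rho * delta, the use of Gini impurity over normalized gamma values, and the rule of choosing the candidate cutoff with the lowest impurity. The function names (`local_density`, `delta_distance`, `gini_impurity`, `pick_cutoff`) are hypothetical, and the K-nearest-neighbor center-selection step is not sketched.

```python
import math


def pairwise_distances(points):
    """Full Euclidean distance matrix for a list of coordinate tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def local_density(dist, dc):
    """Gaussian-kernel local density rho_i for cutoff distance dc (assumed form)."""
    n = len(dist)
    return [
        sum(math.exp(-((dist[i][j] / dc) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]


def delta_distance(dist, rho):
    """Distance from each point to its nearest neighbor of strictly higher density.

    Points with no denser neighbor (the global peak, or tied peaks) get their
    largest distance to any other point, as in standard DPC.
    """
    n = len(dist)
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return delta


def gini_impurity(values):
    """Gini impurity 1 - sum(p_i^2) of non-negative values normalized to sum to 1."""
    total = sum(values)
    if total == 0:
        return 0.0
    return 1.0 - sum((v / total) ** 2 for v in values)


def pick_cutoff(points, candidates):
    """Return (dc, impurity) for the candidate cutoff with the lowest impurity.

    Assumption: a lower impurity of gamma = rho * delta means a few points
    dominate, i.e. the cluster centers stand out more clearly.
    """
    dist = pairwise_distances(points)
    best = None
    for dc in candidates:
        rho = local_density(dist, dc)
        delta = delta_distance(dist, rho)
        gamma = [r * d for r, d in zip(rho, delta)]
        g = gini_impurity(gamma)
        if best is None or g < best[1]:
            best = (dc, g)
    return best


# Two small, well-separated groups of 2-D points (made-up illustrative data).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
candidates = [0.25, 0.5, 1.0, 2.0]
best_dc, best_gini = pick_cutoff(points, candidates)
print(best_dc, best_gini)
```

Scanning a fixed list of candidate cutoffs keeps the sketch simple; an actual implementation might instead search over a range derived from the distance distribution.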
Mathematics, Computer Science