Graph Distance and Adaptive K-Nearest Neighbors Selection-Based Density Peak Clustering

Yuqin Sun,Jingcong Wang,Yuan Sun,Pengcheng Zhang,Tianyi Wang
DOI: https://doi.org/10.1109/access.2024.3403128
IF: 3.9
2024-05-28
IEEE Access
Abstract:Density Peak Clustering (DPC) is known for its rapid identification of cluster centers and successful clustering tasks. However, traditional DPC encounters several issues, which include simplifications in local density and distance metrics, a non-robust single-allocation strategy and limited fault tolerance. To address these challenges, this study introduces an innovative density peak clustering algorithm, named Graph Distance and Adaptive K-Nearest Neighbors Selection-Based Density Peak Clustering (GAK-DPC). Our goal with the approach is to enhance the algorithm's adaptability to non-linear and complex data structures. We achieve this by replacing the traditional Euclidean distance with graph distance. Additionally, we redefine the method for computing local density based on information from K-nearest neighbor data points. By introducing the concept of natural neighbors, the neighborhood radius r is obtained when all instances in the dataset have at least one natural neighbor. Then for the current data point, the number of data points falling within a circle centered on it with radius r is counted as the K-value of that data point. Thus, we achieve the adaptive selection of the K-value. This adaptive K-value strategy takes into account the dataset's characteristics and inter-point neighbor relationships, which enhances the algorithm's adaptability and robustness. Finally, we optimize the secondary allocation strategy for sample points to improve the algorithm's fault tolerance. By conducting comparisons with traditional clustering algorithms on UCI datasets and synthetic datasets, we demonstrate the effectiveness of GAK-DPC.
computer science, information systems,telecommunications,engineering, electrical & electronic
What problem does this paper attempt to address?