Density peaks clustering algorithm based on improved similarity and allocation strategy

Shifei Ding, Wei Du, Chao Li, Xiao Xu, Lijuan Wang, Ling Ding
DOI: https://doi.org/10.1007/s13042-022-01711-7
2023-03-19
International Journal of Machine Learning and Cybernetics
Abstract: The density peaks clustering (DPC) algorithm provides an efficient way to find cluster centers quickly using a decision graph. In recent years, DPC has been widely studied and applied because it has a single parameter, requires no iteration, and is robust. However, it also has shortcomings: it can fail to identify cluster centers effectively, and a misallocated non-central point can trigger a chain reaction of further allocation errors. To address these two shortcomings, an improved density peaks clustering algorithm based on variance (DPCV) is proposed. First, the algorithm uses the variance between points to improve the similarity measure and reduce the density differences in unevenly distributed data sets. Then, based on the similar density relationship between a cluster center and its surrounding points, low-density points are used as the dividing boundary in the initial allocation process. To reduce the time spent calculating the variance, this paper also replaces the variance with the Manhattan distance between points and proposes density peaks clustering based on Manhattan distance (MDDPC). Theoretical analysis and experiments on artificial and UCI data sets show that, compared with DPC and its improved variants, DPCV and MDDPC further improve the clustering accuracy of DPC while keeping the running time under control.
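For orientation, the baseline DPC procedure that the abstract builds on can be sketched as follows. This is a minimal illustration of DPC as it is commonly described (a cutoff-count local density, the distance to the nearest higher-density point, and allocation by following that point), not the DPCV or MDDPC method of the paper, whose variance-based similarity and boundary-based allocation are not specified in the abstract. The function names, the cutoff `d_c`, the use of the product of density and distance to pick centers, and the choice of Manhattan distance as the plain metric are all assumptions made for this example.

```python
def manhattan(p, q):
    """Manhattan (L1) distance between two equal-length points."""
    return sum(abs(a - b) for a, b in zip(p, q))


def decision_graph(points, d_c, dist=manhattan):
    """Return (rho, delta, parent, order) for baseline DPC.

    rho[i]    : number of other points closer than the cutoff d_c
    delta[i]  : distance to the nearest point ranked higher in density
    parent[i] : index of that point (-1 for the top-ranked point)
    order     : indices sorted by decreasing density (ties by index)
    """
    n = len(points)
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < d_c)
           for i in range(n)]
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # Convention: the densest point gets its largest distance.
            delta[i] = max(d[i])
            continue
        j = min(order[:rank], key=lambda k: d[i][k])
        delta[i] = d[i][j]
        parent[i] = j
    return rho, delta, parent, order


def dpc_labels(points, d_c, n_clusters, dist=manhattan):
    """Pick centers by largest rho * delta, then allocate the rest."""
    rho, delta, parent, order = decision_graph(points, d_c, dist)
    # Stable sort over `order` keeps the densest point first on ties.
    centers = sorted(order, key=lambda i: -rho[i] * delta[i])[:n_clusters]
    labels = [-1] * len(points)
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Each non-center inherits the label of its parent. A single wrong
    # parent therefore propagates to every point allocated through it,
    # which is the chain reaction the abstract refers to.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11)]
    print(dpc_labels(pts, d_c=2.5, n_clusters=2))
```

In this sketch the centers are chosen automatically from the product of density and distance; in practice the decision graph is often inspected by eye, and the number of clusters passed here stands in for that manual step.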
computer science, artificial intelligence