DPCG: an Efficient Density Peaks Clustering Algorithm Based on Grid

Xiao Xu, Shifei Ding, Mingjing Du, Yu Xue
DOI: https://doi.org/10.1007/s13042-016-0603-2
2016-01-01
International Journal of Machine Learning and Cybernetics
Abstract: The density peaks clustering (DPC) algorithm was proposed in 2014 to deal with data sets of complex structure. DPC uses the local density and the delta-distance of each point to find the cluster centers. It detects outliers efficiently and finds clusters of arbitrary shape. Unfortunately, its first step requires calculating the distance between every pair of data points, which limits the running speed of DPC on large data sets. To address this issue, this paper introduces a novel grid-based approach, called density peaks clustering algorithm based on grid (DPCG), which overcomes this efficiency problem. When calculating the local density, DPCG uses a grid to reduce the computation time relative to DPC. It requires neither calculating all pairwise distances nor many input parameters. Moreover, DPCG inherits all the merits of the DPC algorithm. Experimental results on UCI data sets and artificial data show that the DPCG algorithm is flexible and effective.
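The abstract does not spell out how the grid is built, so the following is only a minimal sketch of one common way a grid can avoid the all-pairs distance computation when estimating local density, not the authors' DPCG procedure. It assumes the cutoff-kernel density of DPC (the number of other points closer than a cutoff distance) and a cell side equal to that cutoff; the names `grid_local_density` and `d_c` are illustrative.

```python
from collections import defaultdict
from itertools import product
import math


def grid_local_density(points, d_c):
    """Cutoff-kernel local density: for each point, count the other
    points at distance < d_c.

    Assumption: points are binned into cells of side d_c, so any
    neighbor closer than d_c lies in the same cell or an adjacent one.
    Only those cells are searched, not the whole data set.
    """
    dim = len(points[0])

    # Map each occupied cell (a tuple of integer indices) to its point ids.
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        cells[tuple(math.floor(x / d_c) for x in p)].append(idx)

    # Offsets to the 3**dim cells surrounding (and including) a cell.
    offsets = list(product((-1, 0, 1), repeat=dim))

    rho = [0] * len(points)
    for cell, members in cells.items():
        for off in offsets:
            neighbor = tuple(c + o for c, o in zip(cell, off))
            for j in cells.get(neighbor, ()):
                for i in members:
                    if i != j and math.dist(points[i], points[j]) < d_c:
                        rho[i] += 1
    return rho


# Toy data: a tight group of three, a pair, and one isolated point.
sample = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5),
          (5.0, 5.0), (5.2, 5.1), (10.0, 10.0)]
print(grid_local_density(sample, d_c=1.0))  # [2, 2, 2, 1, 1, 0]
```

The isolated point receives density 0, which is how an outlier would show up under this kernel. Note that the number of neighboring cells grows as 3^dim, so this particular scheme is only practical in low dimensions, and the delta-distance step is not covered here.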