Practical path-based methods for clustering arbitrary shaped data sets

Cong Liu, Aimin Zhou, Qiannan Du, Guixu Zhang
DOI: https://doi.org/10.1109/ICNC.2013.6818115
2013-01-01
Abstract: Path-based clustering is a well-known method for extracting arbitrarily shaped clusters. However, its high time complexity limits its range of applications. In this paper, we propose two new algorithms to speed up the original path-based method. The basic method focuses on the path-distance calculation: a modified Floyd algorithm is applied to reduce the time complexity from Θ(n²m + n³ log n) to Θ(n³ + nk). The improved method targets large-scale data sets: a preprocessing step reduces the number of data points passed to the path-based algorithm. Moreover, this algorithm can automatically determine the number of clusters by means of a box clustering. The new approaches are applied to a variety of test data sets with arbitrary shapes, and the experimental results show that our method is efficient in dealing with the given problems.
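The abstract does not spell out how the Floyd algorithm is modified, so the following is only a minimal sketch of one common way to compute path-based (minimax) distances in Θ(n³) time. It assumes that the path distance between two points is the smallest possible value of the largest edge along any path connecting them, with Euclidean distance as the edge weight. The function name `minimax_path_distances` is hypothetical and not taken from the paper.

```python
import math


def minimax_path_distances(points):
    """Return the n x n matrix of minimax path distances (illustrative sketch).

    Assumption: the path distance between i and j is the minimum, over all
    paths from i to j, of the longest edge on that path.
    """
    n = len(points)
    # Start from the direct Euclidean distances between every pair of points.
    d = [[math.dist(p, q) for q in points] for p in points]

    # Floyd-style triple loop, with (min, +) replaced by (min, max):
    # going through k costs the larger of the two legs i->k and k->j.
    for k in range(n):
        row_k = d[k]
        for i in range(n):
            d_ik = d[i][k]
            row_i = d[i]
            for j in range(n):
                via_k = max(d_ik, row_k[j])
                if via_k < row_i[j]:
                    row_i[j] = via_k
    return d
```

For points that form a chain, the resulting distance between the two ends is the largest gap between neighbours rather than the end-to-end Euclidean distance, which is what lets elongated or curved clusters stay together.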