A Hierarchical Clustering Algorithm Based on Grid Partition

Hongbin Zhao, Qilong Han, Haiwei Pan
DOI: https://doi.org/10.1109/MEDIACOM.2010.46
2010-01-01
Abstract: In spatial data mining, the k-means algorithm is probably the most widely applied clustering method. A major drawback of k-means, however, is that it is difficult to choose a value of the parameter k that reflects the natural clusters, and the algorithm is only suitable for convex, roughly spherical clusters. This paper presents an efficient clustering algorithm that combines the hierarchical approach with grid partitioning. The hierarchical approach finds the genuine clusters by repeatedly merging the blocks produced by the grid partition. The Hilbert curve is a continuous path that passes through every point in a space once, giving a one-to-one correspondence between the coordinates of the points and their one-dimensional sequence numbers on the curve. The Hilbert curve is used to preserve locality: points that are close in space and represent similar data should be stored close together in the linear order. Simulation shows that the algorithm can have a shorter execution time than k-means on large databases. Moreover, it can handle clusters of arbitrary shape, which k-means cannot discover.
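The mapping from grid coordinates to a one-dimensional Hilbert sequence number can be illustrated with a short sketch. This is the widely used iterative conversion for a square grid whose side length is a power of two, not necessarily the procedure used in the paper; the function name `hilbert_index` and the parameter `n` are assumptions made for illustration.

```python
def hilbert_index(n, x, y):
    """Return the position of grid cell (x, y) along a Hilbert curve.

    Assumes an n-by-n grid where n is a power of two and
    0 <= x, y < n. Illustrative sketch, not the paper's code.
    """
    d = 0
    s = n // 2
    while s > 0:
        # Determine which quadrant (at the current scale) the cell is in.
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Order the cells of a 4x4 grid along the curve: cells that are
# consecutive in this order are always neighbours in the grid.
cells = [(x, y) for x in range(4) for y in range(4)]
ordered = sorted(cells, key=lambda c: hilbert_index(4, c[0], c[1]))
print(ordered[:4])
```

Sorting grid blocks by such an index is one way to keep spatially close blocks close together in a linear order, which is the locality property the abstract refers to.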