A Clustering Algorithm Based on Grid Partition of Space-Filling Curve

Hong-bo Xu,Zhong-xiao Hao,Qi-long Han
DOI: https://doi.org/10.1109/ICICSE.2009.17
2010-01-01
Abstract:In spatial data mining, clustering is one of useful techniques for discovering interesting data in the underlying data objects. The k-means algorithm is probably the most widely applied clustering method. But a major drawback of k-means algorithm is that it is difficult to determine the parameter k to represent natural cluster, and it is only suitable for concave spherical clusters. Therefore, the paper presents an efficient clustering algorithm for large spatial databases. It combines the hierarchical approach with the grid partition. The hierarchical approach is applied to find the genuine clusters by repeatedly combining together these blocks. Hilbert curve is a continuous path which passes through every point in a space once to form a one-one correspondence between the coordinates of the points and the one-dimensional sequence numbers of the points on the curve. The goal of using Hilbert curve is to preserve the distance of that points which are close in space and represent similar data should be stored close together in the linear order. This kind of mapping also can minimize the disk access effort and provide high speed for clustering. The simulation shows that the clustering algorithm can have shorter execution time than k-means algorithms for the large databases. Moreover, the algorithm can deal with clusters with arbitrary shapes in which the k-means algorithm can not discover.
What problem does this paper attempt to address?