Irregular Grid-Based Clustering over High-Dimensional Data Streams

GuiBin Hou, RuiXia Yao, JiaDong Ren, ChangZhen Hu
DOI: https://doi.org/10.1109/PCSPA.2010.195
2010-01-01
Abstract: Clustering high-dimensional data streams is a difficult and important problem. Grid-based algorithms are sensitive to the size and borders of the grid. To overcome this weakness, we propose a new Irregular Grid-based Clustering algorithm for high-dimensional data streams, called IGDCL. The method combines an irregular grid structure with a subspace clustering algorithm. The irregular grid structure is generated by splitting each dimension into different grid cells, and it is dynamically adjusted as new data arrive. We assign a fading density function to each data point to capture the evolution of the data stream. The final clusters are obtained in subspaces formed by the dimensions associated with the corresponding clusters. Experimental results demonstrate that IGDCL achieves higher clustering quality than CluStream.
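The abstract does not state the form of the fading density function. As a rough illustration only, the sketch below assumes an exponential decay of the kind commonly used in stream clustering, where a point's weight halves every 1/decay time units. The class name GridCell, the parameter decay, and the formula itself are assumptions for illustration, not details taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class GridCell:
    """Hypothetical grid cell holding a time-decayed density."""
    density: float = 0.0      # faded sum of the weights of points seen so far
    last_update: float = 0.0  # timestamp of the most recent update

    def add_point(self, t: float, decay: float = 0.01) -> None:
        # Assumed fading rule: weight(t) = 2 ** (-decay * age).
        # Fade the accumulated density over the elapsed time,
        # then add a weight of 1 for the newly arrived point.
        elapsed = t - self.last_update
        self.density = self.density * 2.0 ** (-decay * elapsed) + 1.0
        self.last_update = t


cell = GridCell()
cell.add_point(0.0)
cell.add_point(100.0, decay=0.01)  # the first point has faded to half weight
print(cell.density)
```

Updating the density lazily, only when a point lands in the cell, avoids touching every cell at every time step; whether IGDCL does the same is not stated in the abstract.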