Subspace Clustering of High Dimensional Data Streams

Shuyun Wang, Yingjie Fan, Chenghong Zhang, HeXiang Xu, Xiulan Hao, Yunfa Hu
DOI: https://doi.org/10.1109/ICIS.2008.58
2008-01-01
Abstract: This paper presents SOStream, a novel subspace-based algorithm for clustering high-dimensional online data streams. SOStream partitions the data space into grids and maintains a superset of all dense units in an online fashion. Deterministic lower and upper bounds on the selectivity of each maintained unit are also given. Using the maintained potentially dense units, SOStream can discover clusters of arbitrary shape in different subspaces of a high-dimensional data stream. Experimental results on real and synthetic datasets demonstrate the effectiveness of the approach.
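The idea of maintaining a superset of dense grid units with deterministic selectivity bounds can be illustrated with a small sketch. The abstract does not give SOStream's actual data structures, so the code below swaps in the well-known Lossy Counting scheme and tracks one-dimensional units only; it is not the authors' method. The class name `DenseUnitTracker`, the parameters `n_intervals` and `epsilon`, and the assumption that every attribute is normalized to [0, 1] are all hypothetical choices made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Unit = Tuple[int, int]  # (dimension index, interval index) -- hypothetical encoding


class DenseUnitTracker:
    """Lossy Counting over 1-D grid units (illustrative stand-in, not SOStream)."""

    def __init__(self, n_intervals: int, epsilon: float) -> None:
        self.n_intervals = n_intervals            # grid intervals per dimension
        self.width = math.ceil(1.0 / epsilon)     # bucket width of Lossy Counting
        self.n_seen = 0                           # points processed so far
        self.table: Dict[Unit, List[int]] = {}    # unit -> [count, delta]

    def _units(self, point: Sequence[float]):
        # Assumes each coordinate lies in [0, 1]; 1.0 is clamped to the last interval.
        for dim, x in enumerate(point):
            yield (dim, min(int(x * self.n_intervals), self.n_intervals - 1))

    def add(self, point: Sequence[float]) -> None:
        self.n_seen += 1
        bucket = math.ceil(self.n_seen / self.width)
        for unit in self._units(point):
            if unit in self.table:
                self.table[unit][0] += 1
            else:
                # delta bounds how many occurrences may have been missed earlier
                self.table[unit] = [1, bucket - 1]
        if self.n_seen % self.width == 0:
            # At a bucket boundary, drop units that cannot be frequent yet.
            self.table = {u: cd for u, cd in self.table.items()
                          if cd[0] + cd[1] > bucket}

    def candidates(self, threshold: float) -> Dict[Unit, Tuple[float, float]]:
        """Units whose upper selectivity bound reaches the threshold.

        For every tracked unit the true selectivity lies in
        [count / n, (count + delta) / n].
        """
        n = self.n_seen
        return {u: (c / n, (c + d) / n)
                for u, (c, d) in self.table.items()
                if (c + d) / n >= threshold}


# Small demo stream: most points fall in the same cell, two are outliers.
stream = ([(0.1, 0.9)] * 6 + [(0.6, 0.3)] + [(0.1, 0.9)] * 6
          + [(0.8, 0.1)] + [(0.1, 0.9)] * 2)
tracker = DenseUnitTracker(n_intervals=4, epsilon=0.125)
for p in stream:
    tracker.add(p)
dense = tracker.candidates(threshold=0.5)
```

Under Lossy Counting, any unit whose true selectivity is at least the threshold is guaranteed to be returned as long as the threshold exceeds `epsilon`, which is what makes the result a superset of the dense units. Extending this to units in higher-dimensional subspaces, as the abstract describes, would require candidate generation that this sketch does not attempt.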