Entropy Based Clustering of Data Streams with Mixed Numeric and Categorical Values

Shuyun Wang, Yingjie Fan, Chenghong Zhang, HeXiang Xu, Xiulan Hao, Yunfa Hu
DOI: https://doi.org/10.1109/ICIS.2008.57
2008-01-01
Abstract: In this paper, a novel algorithm for clustering data streams with mixed numeric and categorical attributes (CNC-Stream) is proposed. A new entropy-based similarity measure, which determines the similarity between objects (data points in the stream or micro-clusters in memory), is also presented; CNC-Stream relies on this measure. Experiments conducted on real and synthetic data sets show that the proposed method is of high quality.