Research on data stream clustering algorithms

Shifei Ding,Fulin Wu,Jun Qian,Hongjie Jia,Fengxiang Jin
DOI: https://doi.org/10.1007/s10462-013-9398-7
IF: 9.588
2013-01-01
Artificial Intelligence Review
Abstract:Data stream is a potentially massive, continuous, rapid sequence of data information. It has aroused great concern and research upsurge in the field of data mining. Clustering is an effective tool of data mining, so data stream clustering will undoubtedly become the focus of the study in data stream mining. In view of the characteristic of the high dimension, dynamic, real-time, many effective data stream clustering algorithms have been proposed. In addition, data stream information are not deterministic and always exist outliers and contain noises, so developing effective data stream clustering algorithm is crucial. This paper reviews the development and trend of data stream clustering and analyzes typical data stream clustering algorithms proposed in recent years, such as Birch algorithm, Local Search algorithm, Stream algorithm and CluStream algorithm. We also summarize the latest research achievements in this field and introduce some new strategies to deal with outliers and noise data. At last, we put forward the focal points and difficulties of future research for data stream clustering.
What problem does this paper attempt to address?