Resource Constrained Data Stream Clustering with Concept Drifting for Processing Sensor Data

Gansen Zhao, Zhongjie Ba, Jiahua Du, Xinming Wang, Ziliu Li, Chunming Rong, Changqin Huang
DOI: https://doi.org/10.4018/ijdwm.2015070103
2015-01-01
International Journal of Data Warehousing and Mining
Abstract: Wireless sensors and mobile devices are widely deployed as data collection devices for monitoring real-world systems. They generate large volumes of stream data in real time, and that data must be processed in real time as well. A common processing operation is clustering, which automatically groups the elements of a stream into a number of clusters so that elements of the same cluster are maximally similar and elements of different clusters are minimally similar. This paper proposes SRAStream, an on-demand clustering framework built on a concept drift detection mechanism. The detection algorithm measures the distance between the new clusters formed by the current data and the existing clusters. Re-clustering is performed to identify new clusters only when concept drift occurs, so SRAStream avoids unnecessary, computation-intensive re-clustering. Experiments suggest that the proposed framework works well and greatly improves processing speed in data stream clustering.
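The idea of re-clustering only on drift can be illustrated with a minimal sketch. This is not the SRAStream algorithm, whose details the abstract does not give. It assumes a stream that arrives in fixed windows of numeric tuples, plain k-means as the clustering step, and a simple drift score (the mean distance from each point in the window to its nearest existing center) compared against a hand-picked threshold. The names `kmeans`, `drift_score`, `process_stream`, and `threshold` are all hypothetical.

```python
import math


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means over tuples; stands in for any clustering step."""
    # Simplistic deterministic initialization: k evenly spaced points.
    centers = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def drift_score(window, centers):
    """Assumed drift measure: mean distance from each point to its nearest center."""
    return sum(min(math.dist(p, c) for c in centers) for p in window) / len(window)


def process_stream(windows, k, threshold):
    """Re-cluster only when the drift score of a window exceeds the threshold."""
    centers = None
    reclusterings = 0
    for window in windows:
        if centers is None or drift_score(window, centers) > threshold:
            centers = kmeans(window, k)  # the expensive step, run on demand
            reclusterings += 1
    return centers, reclusterings
```

In this sketch the per-window cost without drift is one pass over the window to compute the score, while the iterative clustering runs only on the first window and on windows that drift. The threshold trades missed drifts against unnecessary re-clustering and would need tuning for real sensor data.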