Mixed Clustering Algorithm Oriented to Distributed Data Streams

Mao Guo-jun
2011-01-01
Abstract: Data is often collected over distributed networks in daily life, so research on the distributed data stream model has recently attracted considerable attention because of its applications. However, most ongoing studies on mining distributed data streams suffer from problems of accuracy or efficiency. In this paper, an improved synopsis data structure for summarizing data streams is designed, and an effective incremental distributed clustering algorithm named EMCA is presented. Experiments show that EMCA has lower communication cost and higher clustering quality.