Density Based Distribute Data Stream Clustering Algorithm

Bing Gao, Jianpei Zhang
DOI: https://doi.org/10.4304/jsw.8.2.435-442
2013-01-01
Abstract: To address the problem of clustering distributed data streams, the DB-DDSC (Density-Based Distribute Data Stream Clustering) algorithm is proposed. The algorithm consists of two stages. The first stage introduces the concept of a circular-point, built on representative points, and uses an iterative algorithm to find density-connected circular-points, from which the local model is generated at the remote site. The second stage generates global clusters at the coordinator site by combining the local models. DB-DDSC can find clusters of different shapes in a distributed data stream environment, and its test-update algorithm avoids sending data frequently, which reduces data transmission. Experiments show that the algorithm is feasible and scalable.
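The second stage, combining local models at the coordinator, can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not define a circular-point or the connectivity test, so the sketch assumes a circular-point is a center with a radius and treats two circular-points as connected when their circles overlap. The names `CircularPoint`, `connected`, and `merge_local_models` are hypothetical.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Tuple


@dataclass(frozen=True)
class CircularPoint:
    # Assumed representation: the abstract does not give the actual definition.
    center: Tuple[float, ...]
    radius: float


def connected(a: CircularPoint, b: CircularPoint) -> bool:
    # Assumed connectivity test: the two circles touch or overlap.
    return dist(a.center, b.center) <= a.radius + b.radius


def merge_local_models(
    local_models: List[List[CircularPoint]],
) -> List[List[CircularPoint]]:
    """Group circular-points from all sites into global clusters."""
    points = [p for model in local_models for p in model]
    parent = list(range(len(points)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of connected circular-points, regardless of origin site.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if connected(points[i], points[j]):
                parent[find(i)] = find(j)

    clusters = {}
    for i, p in enumerate(points):
        clusters.setdefault(find(i), []).append(p)
    return list(clusters.values())


# Two hypothetical remote sites, each contributing a local model.
site_a = [CircularPoint((0.0, 0.0), 1.0), CircularPoint((1.5, 0.0), 1.0)]
site_b = [CircularPoint((3.0, 0.0), 1.0), CircularPoint((10.0, 10.0), 1.0)]
global_clusters = merge_local_models([site_a, site_b])
```

In this example the chain of overlapping circles along the x-axis spans both sites and ends up in one global cluster, while the isolated circular-point forms its own. Because clusters grow through chains of connected circular-points, they are not restricted to convex shapes, which matches the abstract's claim about finding clusters of different shapes. The pairwise comparison is quadratic in the number of circular-points and is kept only for brevity.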