An improved density based distributed clustering

Zheng Miao-Miao, Ji Gen-Lin
DOI: https://doi.org/10.3321/j.issn:0469-5097.2008.05.010
2008-01-01
Abstract: With the spread of networked applications, large volumes of data are now distributed across sites. Distributed clustering is a challenging research topic because of real-world constraints such as limited bandwidth and limited memory at each site. An improved density-based distributed clustering algorithm is proposed to raise the efficiency of the existing density-based distributed clustering algorithm (DBDC). The improved algorithm incorporates the Bayesian Information Criterion (BIC) and selects only a small number of BIC-core-points to represent each local site, which reduces network load and improves the quality of the global clustering. It operates on two levels, the local level and the global level. On the local level, each site runs DBSCAN independently of the others. Once clustering is complete, a local model consisting of BIC-core-points is determined. The local model is then transferred to a central site, where the local models are merged into a global model by analyzing the local BIC-core-points. Each local representative is assigned a global cluster identifier. The resulting global clustering is broadcast to all local sites, and every local model is updated accordingly. Experimental results show that the improved algorithm is more efficient than the original DBDC.
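The two-level flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how BIC-core-points are chosen, so the hypothetical `local_model` below uses a simpler stand-in (a greedy subset of core points that lie more than `eps` apart within a cluster). The global radius `2 * eps`, the global `min_pts` of 1, and all function names are likewise assumptions made for illustration.

```python
import math

def region(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns (labels, core_flags); label -1 means noise."""
    labels = [None] * len(points)
    core = [len(region(points, i, eps)) >= min_pts for i in range(len(points))]
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        if not core[i]:
            labels[i] = -1          # tentatively noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = region(points, i, eps)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if core[j]:
                queue.extend(region(points, j, eps))
    return labels, core

def local_model(points, eps, min_pts):
    """Local level: cluster one site and pick a few representatives.

    ASSUMPTION: greedy core-point selection stands in for the paper's
    BIC-core-point selection, which the abstract does not specify.
    """
    labels, core = dbscan(points, eps, min_pts)
    reps = []
    for i, p in enumerate(points):
        if core[i] and all(labels[i] != labels[r] or math.dist(p, points[r]) > eps
                           for r in reps):
            reps.append(i)
    return labels, [(points[r], labels[r]) for r in reps]

def distributed_clustering(sites, eps, min_pts):
    """Global level: merge representatives, then relabel every site."""
    local = [local_model(pts, eps, min_pts) for pts in sites]
    # Only the representatives travel to the central site, not the raw data.
    reps = [(s, lab, p) for s, (_, model) in enumerate(local) for p, lab in model]
    # ASSUMPTION: global radius 2 * eps and min_pts = 1 are illustrative choices.
    global_labels, _ = dbscan([p for _, _, p in reps], 2 * eps, 1)
    mapping = {}
    for (s, lab, _), g in zip(reps, global_labels):
        mapping.setdefault((s, lab), g)  # first representative decides the id
    # "Broadcast": each site rewrites its local labels as global identifiers.
    return [[mapping.get((s, lab), -1) for lab in labels]
            for s, (labels, _) in enumerate(local)]
```

Calling `distributed_clustering([site_a, site_b], eps=1.5, min_pts=3)` on two lists of coordinate tuples returns one list of global labels per site, with `-1` marking noise. Because only representatives are sent to the central site, the transmitted volume depends on how many representatives are chosen, which is the cost the BIC-based selection in the paper aims to reduce.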