Parallel Hierarchical Clustering Algorithm Based on Preprocessed Data

Zhaopeng Li, LI Ken-li, Yun Cheng, Zhaojian Li
2010-01-01
Abstract: Hierarchical clustering plays an important role in image processing, intrusion detection, and bioinformatics applications, and it is one of the most extensively studied branches of data mining. Existing parallel hierarchical clustering algorithms do not handle large data sets well. To overcome this shortcoming, this paper proposes a new parallel algorithm based on preprocessed data. The proposed algorithm clusters n objects with O(p) processors in O((λn)^2/p) time, where 1 ≤ p ≤ n/log n and 0.1 ≤ λ ≤ 0.3. Performance comparisons show that it is the first parallel hierarchical clustering algorithm without memory conflicts, and thus it improves on previous results.
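To make the stated bound concrete, the following is a minimal sketch that only evaluates the expression (λn)^2/p under the constraints given in the abstract. It is not the authors' algorithm. The function names `max_processors` and `time_bound` are hypothetical, the logarithm is assumed to be base 2 (the abstract does not say), and the constant factors hidden by the O-notation are ignored.

```python
import math


def max_processors(n: int) -> int:
    """Largest p allowed by the constraint 1 <= p <= n / log n.

    Assumption: log is taken base 2; the abstract does not state the base.
    """
    if n < 2:
        raise ValueError("n must be at least 2 so that log n is positive")
    return max(1, math.floor(n / math.log2(n)))


def time_bound(n: int, p: int, lam: float) -> float:
    """Evaluate (lam * n)^2 / p, dropping the constants hidden by O(.)."""
    if not 0.1 <= lam <= 0.3:
        raise ValueError("lambda must satisfy 0.1 <= lambda <= 0.3")
    if not 1 <= p <= max_processors(n):
        raise ValueError("p must satisfy 1 <= p <= n / log n")
    return (lam * n) ** 2 / p


if __name__ == "__main__":
    n, p, lam = 1024, 64, 0.25
    print(max_processors(n))      # upper limit on p for this n
    print(time_bound(n, p, lam))  # value of (lam * n)^2 / p
```

With n = 1024, p = 64, and λ = 0.25, the expression evaluates to (256)^2/64 = 1024 abstract time units. Because λ enters squared, the values in the stated range scale the n^2/p term by a factor between 0.01 and 0.09.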