A Joint Grid Segmentation Based Affinity Propagation Clustering Method for Big Data

Xiaolu Zhu, Jinglin Li, Zhihan Liu, Fangchun Yang
DOI: https://doi.org/10.1109/hpcc-smartcity-dss.2016.0172
2016-01-01
Abstract: Clustering is useful for discovering underlying groups and identifying interesting patterns in scientific data and engineering systems. Affinity propagation (AP) is an effective clustering algorithm that has been applied successfully across broad areas of computer science. To generate high-quality clusters, AP iteratively propagates information over the full similarity matrix, and exchanging messages between all pairs of data points requires excessive time. This paper proposes a novel AP clustering method based on grid segmentation. The main ideas of our approach are: (1) to partition the data points into multiple non-overlapping subsets, so that a huge dataset is represented as a collection of smaller subsets, and (2) to construct a sparse similarity matrix that reduces unnecessary message exchanges. Experimental evaluations on large-scale real-world datasets demonstrate that the proposed method achieves superior effectiveness and efficiency.
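The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the uniform grid, the `cell_size` parameter, the function names `grid_partition` and `sparse_similarity`, and the choice of negative squared Euclidean distance as the similarity are all assumptions made for illustration, since the abstract does not specify them. The sketch only shows how restricting similarities to points in the same grid cell shrinks the number of pairs that AP would need to exchange messages over.

```python
from collections import defaultdict

import numpy as np


def grid_partition(points, cell_size):
    """Assign each point to one non-overlapping grid cell (assumed uniform grid).

    Returns a dict mapping a cell coordinate tuple to the list of point indices in it.
    """
    cells = defaultdict(list)
    keys = np.floor(points / cell_size).astype(int).tolist()
    for idx, key in enumerate(map(tuple, keys)):
        cells[key].append(idx)
    return dict(cells)


def sparse_similarity(points, cells):
    """Build a sparse similarity map holding only same-cell pairs.

    Similarity is the negative squared Euclidean distance (an assumption here);
    pairs in different cells are simply absent, so no messages would be
    exchanged between them.
    """
    sim = {}
    for members in cells.values():
        for i in members:
            for j in members:
                if i != j:
                    sim[(i, j)] = -float(np.sum((points[i] - points[j]) ** 2))
    return sim


# Toy data: two tight groups and one isolated point.
points = np.array([[0.1, 0.2], [0.3, 0.4], [5.1, 5.2], [5.3, 5.0], [9.9, 0.1]])
cells = grid_partition(points, cell_size=1.0)
sim = sparse_similarity(points, cells)

n = len(points)
print(f"cells: {len(cells)}, stored pairs: {len(sim)}, full matrix pairs: {n * (n - 1)}")
```

On this toy input the sparse map stores far fewer entries than the full off-diagonal similarity matrix. A real pipeline would still need to run AP message passing on the sparse structure and to handle clusters that straddle cell boundaries, neither of which is attempted here.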