An efficient PAM spatial clustering algorithm based on MapReduce

Jun Yue, Shanjun Mao, Mei Li, Xuesen Zou
DOI: https://doi.org/10.1109/GEOINFORMATICS.2014.6950803
2014-01-01
GEOINFORMATICS
Abstract: Clustering analysis has been a hot area of spatial data mining for several years. With the rapid development of spatial information technology, the amount of spatial data is growing exponentially, which makes clustering massive spatial datasets a challenging task. To improve the efficiency of clustering on massive spatial data, a parallel implementation of the Partitioning Around Medoids (PAM) spatial clustering algorithm based on MapReduce is proposed. Experiments on Hadoop and HBase demonstrate that the proposed algorithm processes massive spatial data efficiently and scales well on commodity hardware.
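The abstract does not describe how the clustering work is split between map and reduce tasks, so the following is only an illustrative sketch and not the authors' implementation. It assumes one common decomposition: the map step assigns each point to its nearest medoid, and the reduce step picks, for each cluster, the member with the smallest total distance to the other members. This is a simplified alternating k-medoids update rather than the full PAM swap search, and it runs in a single process instead of on Hadoop. The function names, the 2-D Euclidean distance, and the sample points are all assumptions made for illustration.

```python
from collections import defaultdict
from math import hypot


def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return hypot(a[0] - b[0], a[1] - b[1])


def map_assign(point, medoids):
    """Map step: emit (index of the nearest medoid, point)."""
    nearest = min(range(len(medoids)), key=lambda i: dist(point, medoids[i]))
    return nearest, point


def reduce_medoid(members):
    """Reduce step: pick the member with the smallest total distance to the rest."""
    return min(members, key=lambda c: sum(dist(c, p) for p in members))


def k_medoids(points, medoids, max_iter=20):
    """Repeat map/shuffle/reduce rounds until the medoids stop changing."""
    medoids = list(medoids)
    for _ in range(max_iter):
        groups = defaultdict(list)  # stands in for the shuffle phase
        for key, value in (map_assign(p, medoids) for p in points):
            groups[key].append(value)
        # Keep the old medoid if a cluster received no points.
        updated = [reduce_medoid(groups[i]) if groups[i] else medoids[i]
                   for i in range(len(medoids))]
        if updated == medoids:
            break
        medoids = updated
    return medoids


# Hypothetical sample data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
result = k_medoids(points, medoids=[(0, 1), (10, 11)])
print(result)
```

In a real MapReduce job each round would be one job submission, with the current medoids distributed to every mapper. The reduce step here is quadratic in cluster size, which is the part a distributed design has to handle carefully for large clusters.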