PAM Spatial Clustering Algorithm Research Based on CUDA

Enbo Zhou, Shanjun Mao, Mei Li, Zhenming Sun
DOI: https://doi.org/10.1109/geoinformatics.2016.7578971
2016-01-01
Abstract: The K-Medoids algorithm plays an important role in spatial data mining because it limits the influence of outliers. However, it faces several challenges, including the selection of the initial medoids and a computational cost that grows with spatial data volume. To improve the validity of the results and the time efficiency of clustering on massive spatial data, we modify the traditional K-Medoids algorithm, Partitioning Around Medoids (PAM), to run efficiently on Graphics Processing Units (GPUs). We develop a matrix method that replaces the separate distance computation between every ordinary point and the medoids, reducing data transfer between the GPU's global memory and shared memory. In addition, we use a Simulated Annealing Algorithm (SAA) to find the initial medoids. Several experiments implemented with the Compute Unified Device Architecture (CUDA) on datasets of different sizes demonstrate that the proposed algorithm works efficiently and produces valid results.
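The abstract names two ingredients: a point-to-medoid distance matrix and simulated annealing for choosing the initial medoids. The sketch below illustrates both ideas on the CPU in plain Python. It is not the authors' CUDA implementation. The function names (`distance_matrix`, `total_cost`, `assign`, `anneal_medoids`), the Euclidean metric, the swap-based proposal, and the cooling schedule (`t0`, `cooling`, `steps`) are all assumptions made for illustration, and `k` is assumed to be smaller than the number of points.

```python
import math
import random

def distance_matrix(points, medoid_idx):
    """Rows are points, columns are medoids; entry [i][j] is a Euclidean distance."""
    return [[math.dist(p, points[m]) for m in medoid_idx] for p in points]

def total_cost(points, medoid_idx):
    """PAM objective: sum over all points of the distance to the nearest medoid."""
    return sum(min(row) for row in distance_matrix(points, medoid_idx))

def assign(points, medoid_idx):
    """Return, for each point, the column index of its nearest medoid."""
    d = distance_matrix(points, medoid_idx)
    return [min(range(len(medoid_idx)), key=row.__getitem__) for row in d]

def anneal_medoids(points, k, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Search for a low-cost medoid set with simulated annealing (illustrative schedule)."""
    rng = random.Random(seed)
    current = rng.sample(range(len(points)), k)
    cost = total_cost(points, current)
    best, best_cost = list(current), cost
    t = t0
    for _ in range(steps):
        # Propose a neighbour: swap one medoid for a random non-medoid point.
        candidate = list(current)
        non_medoids = [i for i in range(len(points)) if i not in current]
        candidate[rng.randrange(k)] = rng.choice(non_medoids)
        c = total_cost(points, candidate)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if c < cost or rng.random() < math.exp((cost - c) / t):
            current, cost = candidate, c
            if cost < best_cost:
                best, best_cost = list(current), cost
        t *= cooling  # geometric cooling
    return best, best_cost

if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    medoids, cost = anneal_medoids(pts, k=2)
    print(medoids, cost, assign(pts, medoids))
```

On a GPU, each entry of the distance matrix is independent of the others, which is what makes this formulation a natural fit for one-thread-per-entry parallelism. How the paper tiles the matrix through shared memory is not described in the abstract and is not modelled here.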