Improving Density Peaks Clustering Through GPU Acceleration
Zhuojin Liu, Shufeng Gong, Yuxuan Su, Changyi Wan, Yanfeng Zhang, Ge Yu
DOI: https://doi.org/10.1016/j.future.2022.11.033
IF: 7.307
2023-01-01
Future Generation Computer Systems
Abstract: Density Peaks Clustering (DPC) is a recently proposed clustering algorithm that has distinct advantages over existing clustering algorithms and has already been used in a wide range of applications. However, DPC requires computing the distance between every pair of input points, which incurs quadratic computation overhead and is prohibitive for large data sets. To address this efficiency problem, we propose using the GPU to accelerate DPC. We exploit a spatial index structure, the VP-Tree, to efficiently maintain the data points, and we propose a GPU-friendly parallel VP-Tree construction algorithm. Based on the constructed VP-Tree, we propose a GPU-accelerated DPC algorithm, GDPC, in which the all-pair computation in DPC is greatly accelerated. Furthermore, to process dynamically evolving data sets, we propose an incremental variant, Incremental GDPC. Our results show that GDPC achieves a 5.3-148.9X speedup over state-of-the-art GPU-based, multicore-based, and distributed DPC implementations, and a 2.3-40.5X speedup over the state-of-the-art incremental DPC algorithm.
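
To make the quadratic cost mentioned in the abstract concrete, the following is a minimal brute-force sketch of the two per-point quantities that DPC is commonly described as computing: the local density (rho) and the distance to the nearest denser point (delta). It is not the GDPC algorithm and does not use a VP-Tree or a GPU. The function name `dpc_quantities`, the cutoff parameter `d_c`, the index-based tie-breaking rule, and the sample points are all assumptions made for illustration; the abstract does not specify them.

```python
import math

def dpc_quantities(points, d_c):
    """Brute-force DPC quantities (illustrative sketch, not GDPC).

    rho[i]    = number of other points within cutoff distance d_c of point i
    delta[i]  = distance from point i to its nearest point of higher density
    parent[i] = index of that nearest denser point, or -1 for the densest point
    """
    n = len(points)
    # All-pair distance matrix: this is the O(n^2) step the abstract refers to.
    dist = [[math.dist(p, q) for q in points] for p in points]

    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    delta = [0.0] * n
    parent = [-1] * n
    for i in range(n):
        best = math.inf
        for j in range(n):
            # Assumption: ties in density are broken by index (lower index
            # counts as denser) so that the density order is strict.
            if (rho[j], -j) > (rho[i], -i) and dist[i][j] < best:
                best = dist[i][j]
                parent[i] = j
        # By convention, the densest point gets its maximum distance to any point.
        delta[i] = best if parent[i] != -1 else max(dist[i])
    return rho, delta, parent

# Hypothetical toy input: two well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
rho, delta, parent = dpc_quantities(points, d_c=1.5)
print(rho, delta, parent)
```

Points with both a large rho and a large delta are the candidate cluster centers (density peaks); every other point can then follow its `parent` link to reach a center. Both nested loops visit every pair of points, which is the overhead that an index structure such as the VP-Tree is meant to reduce.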