PANDORA: A Parallel Dendrogram Construction Algorithm for Single Linkage Clustering on GPU

Piyush Sao, Andrey Prokopenko, Damien Lebrun-Grandié
2024-01-12
Abstract: This paper presents PANDORA, a novel parallel algorithm for efficiently constructing dendrograms for single-linkage hierarchical clustering, including HDBSCAN*. Traditional dendrogram construction methods from a minimum spanning tree (MST), such as agglomerative or divisive techniques, often fail to parallelize efficiently, especially with the skewed dendrograms common in real-world data.
Machine Learning; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
The problem this paper addresses is that, on large-scale datasets, existing dendrogram construction methods for single-linkage clustering are inefficient and hard to parallelize, particularly when the dendrogram is highly skewed. Specifically, the traditional agglomerative and divisive techniques for building a dendrogram from a minimum spanning tree (MST) do not parallelize efficiently on multi-threaded accelerators such as GPUs, especially for the skewed dendrograms that are common in real-world data.

### Specific Problem Description

1. **Limitations of Existing Methods**:
   - Traditional methods, such as the agglomerative and divisive methods, perform poorly on skewed dendrograms.
   - These methods are difficult to parallelize efficiently on GPUs, which results in long computing times.
   - For large-scale, low-dimensional datasets, dendrogram construction becomes the main bottleneck and takes up most of the computing time.
2. **Challenges in Practical Applications**:
   - For large-scale datasets in fields such as astronomy and bioinformatics, dendrogram construction takes so long that it degrades the performance of the overall algorithm.
   - In the HDBSCAN* algorithm in particular, MST construction can be completed efficiently on a GPU, but dendrogram construction still depends on the CPU, which creates a performance bottleneck.

### Goals of the Paper

To solve these problems, the paper proposes the PANDORA algorithm, which constructs the dendrogram for single-linkage clustering through a novel recursive tree-contraction method. The algorithm is fully parallel and suited to multi-threaded accelerators such as GPUs. Specific goals include:

- **Improve Parallelization Efficiency**: Ensure that every step can be executed efficiently in a multi-threaded environment and makes full use of the computing power of the GPU.
- **Handle Skewed Dendrograms**: Handle highly skewed dendrograms effectively and avoid the performance degradation that traditional methods suffer on such structures.
- **Optimize Computational Complexity**: Use recursive tree contraction to build a simplified initial dendrogram first, then progressively reconstruct the complete dendrogram, so that the algorithm retains optimal computational complexity in the worst case.

### Main Contributions

- Propose a novel parallel dendrogram construction algorithm based on tree contraction.
- Analyze the necessary and sufficient conditions for edge-contraction techniques.
- Derive the asymptotic lower bound on the complexity of any dendrogram construction algorithm and prove that the PANDORA algorithm is work-optimal.
- Provide an efficient, performance-portable implementation that supports three different architectures (including GPU) and significantly speeds up the HDBSCAN* algorithm on GPU.

Through these improvements, the PANDORA algorithm significantly improves the efficiency of dendrogram construction on large-scale datasets, achieving a 15- to 37-fold speed-up on GPU.
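
To make the bottleneck concrete, the following is a minimal Python sketch of the conventional sequential bottom-up (agglomerative) construction of a dendrogram from an MST, using a union-find structure. It is not the PANDORA algorithm and is not taken from the paper's implementation. The function names `build_dendrogram` and `dendrogram_height`, the parent-array layout, and the toy inputs are all assumptions chosen for illustration. The sketch shows why the baseline resists parallelization: edges are merged one at a time in weight order, and each merge depends on the result of the previous ones.

```python
def build_dendrogram(n_vertices, mst_edges):
    """Sequential bottom-up dendrogram construction from MST edges.

    mst_edges: list of (u, v, weight) tuples forming a spanning tree.
    Returns a parent array (hypothetical layout, chosen for this sketch):
      - nodes 0 .. n_vertices-1 are the leaves (data points),
      - node n_vertices + i is the i-th edge in ascending weight order,
      - the root has parent -1.
    """
    edges = sorted(mst_edges, key=lambda e: e[2])
    parent = [-1] * (n_vertices + len(edges))
    uf = list(range(n_vertices))   # union-find forest over the vertices
    top = list(range(n_vertices))  # top[r]: dendrogram node on top of cluster r

    def find(x):
        # Find the cluster representative, with path halving.
        while uf[x] != x:
            uf[x] = uf[uf[x]]
            x = uf[x]
        return x

    # Inherently sequential loop: each merge reads state written by earlier merges.
    for i, (u, v, _weight) in enumerate(edges):
        ru, rv = find(u), find(v)
        node = n_vertices + i
        parent[top[ru]] = node     # the two merged clusters become
        parent[top[rv]] = node     # children of the new edge node
        uf[ru] = rv
        top[rv] = node
    return parent


def dendrogram_height(parent, n_vertices):
    """Length of the longest leaf-to-root path, a simple measure of skew."""
    best = 0
    for leaf in range(n_vertices):
        depth, x = 0, leaf
        while parent[x] != -1:
            x = parent[x]
            depth += 1
        best = max(best, depth)
    return best


# Toy inputs (assumed): a path graph with increasing weights gives a chain.
chain = build_dendrogram(4, [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 3.0)])
balanced = build_dendrogram(4, [(0, 1, 1.0), (2, 3, 2.0), (1, 2, 3.0)])
print(dendrogram_height(chain, 4), dendrogram_height(balanced, 4))
```

On these toy inputs the path graph with increasing weights yields a dendrogram of height 3, a chain with one merge per level, while the second input yields a balanced dendrogram of height 2. A chain-like result is the skewed case described above: its height grows linearly with the number of points, which leaves a level-by-level parallel scheme with little work to share at each level.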