Optimal Parallel Algorithms for Dendrogram Computation and Single-Linkage Clustering

Laxman Dhulipala, Xiaojun Dong, Kishen N Gowda, Yan Gu
2024-05-13
Abstract: Computing a Single-Linkage Dendrogram (SLD) is a key step in the classic single-linkage hierarchical clustering algorithm. Given an input edge-weighted tree $T$, the SLD of $T$ is a binary dendrogram that summarizes the $n-1$ clusterings obtained by contracting the edges of $T$ in order of weight. Existing algorithms for computing the SLD all require $\Omega(n\log n)$ work, where $n = |T|$. Furthermore, to the best of our knowledge, no prior work provides a parallel algorithm that obtains non-trivial speedup for this problem.
Data Structures and Algorithms; Distributed, Parallel, and Cluster Computing
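To make the SLD definition in the abstract concrete, here is a minimal sequential sketch of the textbook baseline: sort the tree edges by weight and contract them one at a time with a union-find structure. This is not the paper's algorithm; it is the kind of $O(n\log n)$ approach the abstract refers to as existing work. The function name `single_linkage_dendrogram`, the `(u, v, w)` edge format, and the node numbering (leaves `0..n-1`, internal node `n + rank` for the edge with that rank in sorted order) are assumptions made for illustration.

```python
def single_linkage_dendrogram(n, edges):
    """Sequential baseline sketch (hypothetical helper, not the paper's method).

    n     -- number of vertices in the input tree, labelled 0..n-1
    edges -- list of (u, v, w) tuples forming a tree on those vertices

    Returns a parent array of length 2n-1. Nodes 0..n-1 are leaves (the input
    vertices); node n + r is the internal node created by the edge of rank r
    in increasing weight order. The root has parent -1.
    """
    # Sorting the edges is what costs O(n log n) in this baseline.
    order = sorted(range(len(edges)), key=lambda i: edges[i][2])

    uf = list(range(n))    # union-find parent pointers over tree vertices
    top = list(range(n))   # dendrogram node representing each cluster root
    parent = [-1] * (2 * n - 1)

    def find(x):
        # Find the cluster representative, with path halving.
        while uf[x] != x:
            uf[x] = uf[uf[x]]
            x = uf[x]
        return x

    for rank, i in enumerate(order):
        u, v, _ = edges[i]
        ru, rv = find(u), find(v)
        node = n + rank
        # Contracting this edge merges two clusters under a new internal node.
        parent[top[ru]] = node
        parent[top[rv]] = node
        uf[ru] = rv
        top[rv] = node

    return parent


# Path 0-1-2-3 with edge weights 3, 1, 2.
example_parent = single_linkage_dendrogram(4, [(0, 1, 3), (1, 2, 1), (2, 3, 2)])
print(example_parent)
```

In the example, the lightest edge (1, 2) is contracted first, then (2, 3), then (0, 1), so vertex 0 joins the dendrogram only at the root.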