Distributed Algorithms for Connectivity and MST in Large Graphs with Efficient Local Computation
Eric Ajieren, Khalid Hourani, Gopal Pandurangan, William K. Moses Jr.
DOI: https://doi.org/10.1145/3491003.3491011
2022-01-04
Abstract: We study distributed algorithms for large-scale graphs, focusing on the fundamental problems of connectivity and minimum spanning tree (MST). We consider the k-machine model, a well-studied model for distributed computing for large-scale graph computations, where k ≥ 2 machines jointly perform computations on graphs with n nodes (typically, n ≫ k). The input graph is assumed to be initially randomly partitioned among the k machines, a common implementation in many real-world systems. Communication is point-to-point, and the goal is to minimize the number of communication rounds (denoted Tc) of the computation. While communication is a significant factor that affects the time needed for large-scale computations, the computation cost incurred by the individual machines also contributes to the overall time complexity of the distributed algorithm. We posit a complexity measure called the local computation cost (denoted Tℓ) that measures the worst-case local computation cost among the machines. A lower bound for Tℓ in our model is Ω((m + n)/k + Δ + k), while a lower bound on Tc is Ω(n/k²) [Klauck et al., SODA 2015], where m is the number of edges and Δ is the maximum degree. Prior algorithms for connectivity and MST in the k-machine model [Klauck et al., SODA 2015, Pandurangan et al., SPAA 2016] do not take into account local computation; a straightforward local implementation of these algorithms is not optimal with respect to local computation. In this paper, we study several distributed algorithms for connectivity and MST and analyze their performance with respect to both the computation and communication cost. In particular, we analyze a well-studied flooding algorithm for connectivity and connected components that takes rounds and local computation time. We then present a deterministic filtering algorithm that has an improved round complexity of but local computation complexity of .
Next, we present two deterministic algorithms which are increasingly sophisticated implementations of the classical Borůvka’s algorithm, the last of which has round complexity and local computation complexity . We finally present a randomized algorithm to find connected components with round complexity and local computation complexity that are both essentially optimal (up to polylogarithmic factors).
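For readers unfamiliar with the classical Borůvka's algorithm mentioned above, the following is a minimal sequential sketch in Python. It is not one of the paper's distributed k-machine implementations and says nothing about round or local computation complexity; the function name `boruvka_mst` and the edge-list input format are assumptions made only for illustration. In each phase, every component picks its lightest outgoing edge (ties broken by edge index), and the chosen edges are merged using a union-find structure.

```python
def boruvka_mst(n, edges):
    """Classical (sequential) Boruvka sketch.

    n     -- number of nodes, labelled 0..n-1 (illustrative convention)
    edges -- list of (u, v, weight) tuples for an undirected graph
    Returns (forest_edges, number_of_components).
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    components = n
    while components > 1:
        # cheapest[c] = index of the lightest edge leaving component c.
        cheapest = {}
        for i, (u, v, w) in enumerate(edges):
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # edge lies inside one component
            for r in (ru, rv):
                j = cheapest.get(r)
                # Break weight ties by edge index so the order is total.
                if j is None or (w, i) < (edges[j][2], j):
                    cheapest[r] = i
        if not cheapest:
            break  # no outgoing edges left: the graph is disconnected
        for i in set(cheapest.values()):
            u, v, w = edges[i]
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                forest.append((u, v, w))
                components -= 1
    return forest, components
```

On a connected graph the returned edges form a minimum spanning tree and the component count is 1; on a disconnected graph the result is a minimum spanning forest, and the component count gives the number of connected components.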