WindGP: Efficient Graph Partitioning on Heterogenous Machines

Li Zeng, Haohan Huang, Binfan Zheng, Kang Yang, Shengcheng Shao, Jinhua Zhou, Jun Xie, Rongqian Zhao, Xin Chen
2024-03-06
Abstract: Graph Partitioning is widely used in many real-world applications such as fraud detection and social network analysis, in order to enable the distributed graph computing on large graphs. However, existing works fail to balance the computation cost and communication cost on machines with different power (including computing capability, network bandwidth and memory size), as they only consider replication factor and neglect the difference of machines in realistic data centers. In this paper, we propose a general graph partitioning algorithm WindGP, which can support fast and high-quality edge partitioning on heterogeneous machines. WindGP designs novel preprocessing techniques to simplify the metric and balance the computation cost according to the characteristics of graphs and machines. Also, best-first search is proposed instead of BFS and DFS, in order to generate clusters with high cohesion. Furthermore, WindGP adaptively tunes the partition results by sophisticated local search methods. Extensive experiments show that WindGP outperforms all state-of-the-art partition methods by 1.35 - 27 times on both dense and sparse distributed graph algorithms, and has good scalability with graph size and machine number.
Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper primarily addresses the issue of efficient graph partitioning on heterogeneous machines. Specifically:

1. **Problems with Existing Methods**:
   - Existing graph partitioning algorithms are mainly designed for homogeneous machines (i.e., all machines have the same configuration) and cannot effectively balance computation and communication costs.
   - When applied to heterogeneous environments (where different machines have varying computational power, memory size, and network bandwidth), these algorithms perform poorly.
2. **Research Background and Objectives**:
   - Graph partitioning is crucial in many practical applications, such as financial fraud detection and social network analysis.
   - In large-scale distributed graph computing, the quality of graph partitioning directly impacts overall performance.
   - Heterogeneous computing is becoming a trend, but existing graph partitioning algorithms do not support heterogeneous environments.
3. **Proposed Method**:
   - A new graph partitioning algorithm, WindGP, is proposed, which can achieve fast and high-quality edge partitioning on heterogeneous machines.
   - New preprocessing techniques are introduced to simplify metrics and balance computation costs based on the characteristics of the graph and machines.
   - Best-first search is used instead of traditional breadth-first search or depth-first search to generate highly cohesive partitions.
   - A complex subgraph local search method is employed to adaptively adjust partitioning results.
4. **Experimental Results**:
   - Experiments show that WindGP outperforms all existing methods in both dense and sparse distributed graph algorithms, with performance improvements ranging from 1.35 times to 27 times, and it exhibits good scalability.

Through these improvements, WindGP can achieve better graph partitioning in heterogeneous environments, thereby enhancing the overall performance of distributed graph computing.
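
To make the two central ideas above more concrete (capacity-aware balancing and best-first expansion), here is a minimal Python sketch. It is not WindGP's actual algorithm: the summary does not give the paper's cost model or scoring function, so everything here is an assumption made for illustration. The function names (`edge_budgets`, `grow_cluster`, `partition`), the single scalar capacity per machine, and the scoring rule (prefer the frontier vertex with the most neighbours already in the cluster) are all hypothetical.

```python
import heapq
from collections import defaultdict


def edge_budgets(num_edges, capacities):
    """Split num_edges across machines in proportion to a scalar capacity.

    Assumption: one number per machine stands in for its overall power.
    """
    total = sum(capacities)
    budgets = [num_edges * c // total for c in capacities]
    leftover = num_edges - sum(budgets)
    # Hand the rounding remainder to the most capable machines first.
    strongest = sorted(range(len(capacities)), key=lambda i: -capacities[i])
    for i in strongest[:leftover]:
        budgets[i] += 1
    return budgets


def grow_cluster(adj, seed, budget, assigned):
    """Best-first growth: always absorb the frontier vertex with the most
    neighbours already inside the cluster (a simple cohesion heuristic)."""
    members = {seed}
    edges = []
    score = defaultdict(int)  # vertex -> number of neighbours in the cluster
    heap = []                 # max-heap via negated scores, lazily updated

    def push_neighbours(v):
        for u in adj[v]:
            if u not in members:
                score[u] += 1
                heapq.heappush(heap, (-score[u], u))

    push_neighbours(seed)
    while heap and len(edges) < budget:
        neg, v = heapq.heappop(heap)
        if v in members or -neg != score[v]:
            continue  # stale heap entry
        members.add(v)
        for u in adj[v]:
            e = frozenset((u, v))
            if u in members and e not in assigned and len(edges) < budget:
                assigned.add(e)
                edges.append((u, v))
        push_neighbours(v)
    return edges


def partition(adj, capacities):
    """Assign every edge to one machine, filling each machine's budget."""
    num_edges = sum(len(ns) for ns in adj.values()) // 2
    assigned = set()
    parts = []
    for budget in edge_budgets(num_edges, capacities):
        part = []
        for seed in sorted(adj):
            if len(part) >= budget:
                break
            if any(frozenset((seed, u)) not in assigned for u in adj[seed]):
                part += grow_cluster(adj, seed, budget - len(part), assigned)
        parts.append(part)
    return parts


# Two triangles joined by a bridge; machine 0 is twice as capable as machine 1.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
parts = partition(adj, [2, 1])
print([len(p) for p in parts])  # [5, 2]
```

A breadth-first or depth-first expansion would pick the next vertex by discovery order alone. The priority queue instead lets a scoring rule decide, which is the general property the summary attributes to best-first search. The local search refinement step mentioned above is not modelled here.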