Towards Fast Large-scale Graph Analysis via Two-dimensional Balanced Partitioning.
Shuai Lin,Rui Wang,Yongkun Li,Yinlong Xu,John C. S. Lui,Fei Chen,Pengcheng Wang,Lei Han
DOI: https://doi.org/10.1145/3545008.3545060
2022-01-01
Abstract:Distributed graph systems often leverage a cluster of machines by partitioning a large graph into multiple small-size subgraphs. Thus, graph partition usually has a significant impact on the performance of distributed graph systems. However, existing widely used partition schemes in practical graph systems can realize a good balance only in one dimension, e.g., either the number of vertices or the number of edges, and they may also incur lots of edge cuts. To address the problem, we develop BPart, which adopts a two-phase partition scheme to realize two-dimensional balance for both vertices and edges. Its core idea is to first partition the original graph into more small pieces than the cluster scale, and combine the partition to realize desired properties, then selectively combine the small pieces to construct larger subgraphs to generate two-dimensional balanced partition. We implement BPart into two open-source distributed graph systems, Gemini [58] and KnightKing [57]. Results show that BPart realizes good balance in both dimensions, and also significantly reduces the number of edge cuts. As a result, BPart reduces the total running time of various graph applications by 5% - 70%, compared to multiple existing partition schemes, e.g., Chunk-V, Chunk-E, Fennel, and Hash.