FT-topo: Architecture-Driven Folded-Triangle Partitioning for Communication-efficient Graph Processing

Xinbiao Gan, Guang Wu, Ruigeng Zeng, Jiaqi Si, Cong Liu, Ji Liu, Daxiang Dong, Chunye Gong, Tiejun Li
DOI: https://doi.org/10.1145/3577193.3593729
2023-01-01
Abstract: As graph sizes (numbers of vertices and edges) grow from billions to trillions, efficient graph processing requires exascale computing clusters, which consist of hundreds of thousands of nodes connected via hierarchical networks with multiple levels of communication domains, e.g., multilevel triangle communication domains. While the computation in traversal-centric graph algorithms is relatively simple (e.g., status checks), communication is the bottleneck because numerous small messages must be transferred among hierarchical triangle communication domains. In this paper, we propose FT-topo, a communication-efficient graph partitioning policy for processing exascale graphs. The key idea of FT-topo is to directly map the big graph onto the hierarchical topology of exascale clusters. We carry out extensive experiments, running various graph algorithms on synthetic and real-world graphs on both the Tianhe supercomputer and commercial clusters, to show the advantages of FT-topo. FT-topo substantially mitigates communication overhead and is thus orders of magnitude faster than state-of-the-art methods. In particular, the FT-topo-based Tianhe supercomputer outperforms the fastest BFS and SSSP systems in the latest Graph500 lists. Furthermore, we deployed FT-topo on other large-scale commercial clusters, where it greatly improves graph processing performance. FT-topo-based graph operators outperform state-of-the-art graph partitioning methods and graph systems by orders of magnitude on real-world graphs.