Practical and high-quality partitioning algorithm for large-scale and time-evolving graphs

Xiangyu Luo,Yingxiao Luo,Gang Xin,Xiaolin Gui,Jia Wang,Cheng Guo
DOI: https://doi.org/10.1016/j.knosys.2021.107211
IF: 8.139
2021-01-01
Knowledge-Based Systems
Abstract:With the appearance of widespread large-scale graph data, distributed graph processing grows in popularity. In order to store and analyze a large-scale graph in a distributed manner, the graph should be properly partitioned in advance. Although the partitioning of static graphs has been sufficiently investigated, the emerging graphs nowadays are typically dynamic or time-evolving. State-of-the-art partitioning algorithms for such large-scale and time-evolving graphs either are impractical because of the complexity caused by the global optimization or sacrifice partitioning quality due to a huge number of cross partition edges. This paper investigates the problem and proposes a new practical and high-quality graph partitioning algorithm. First, graph updates accumulated in a time interval are expressed as a delta graph. Second, vertices in the delta graph are classified into several subsets based on a non-overlapping community detection method. Third, vertices in the same subset are assigned to a properly chosen partition as a whole. The chosen partition is lightly loaded as well as closely related with vertices in the subset. Experimental results show that compared with the widely used hash-based and heuristic-based partitioning algorithms, our proposed algorithm gains significant decrease in terms of the number of cross partition edges while maintaining a similar level of load balance.
What problem does this paper attempt to address?