A Large-Scale Graph Partition Algorithm with Redundant Multi-Order Neighbor Vertex Storage

Huanqing Cui, Di Yang, Chuanai Zhou
DOI: https://doi.org/10.1016/j.ins.2024.120473
IF: 8.1
2024-03-21
Information Sciences
Abstract: Graph data have recently become larger and increasingly important in many fields, and many distributed graph computing systems have been proposed to handle large-scale graphs. Graph partitioning algorithms underpin these systems. In many graph applications, a vertex updates its feature by aggregating information from its multi-order neighboring vertices, and existing graph partitioning algorithms often incur heavy communication costs. This paper proposes a graph partitioning algorithm that stores multi-order neighbor vertices redundantly to avoid this communication. It first formulates graph partitioning with redundant multi-order neighbor vertex storage as an optimization problem, and then proposes the PARN (partition algorithm with redundant multi-order neighbor) algorithm to solve it. PARN consists of an initial partition followed by vertex migration based on a genetic algorithm, and it classifies the vertices of a partition into primary and auxiliary vertices. The auxiliary vertices required by the primary vertices of a partition are stored in that same partition, so the primary vertices do not need to fetch information from other partitions. The experimental results show that the proposed algorithm is comparable to the Hash, GAP, GNP, and QCLP algorithms in terms of load balance and produces fewer auxiliary vertices than these algorithms.
computer science, information systems
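The primary/auxiliary distinction in the abstract can be illustrated with a small sketch. This is not the PARN algorithm (the abstract gives no detail on its initial partition or genetic migration steps); it only shows, under stated assumptions, how the auxiliary vertices of one partition might be counted. The function name `auxiliary_vertices`, the adjacency-dictionary representation, the undirected toy graph, and the reading of "multi-order neighbor" as "within `order` hops" are all assumptions made for illustration.

```python
from collections import deque


def auxiliary_vertices(adjacency, primary, order):
    """Return the non-primary vertices within `order` hops of any primary vertex.

    Hypothetical helper: `adjacency` maps each vertex to an iterable of its
    neighbors, `primary` is the set of vertices assigned to one partition,
    and `order` is the assumed neighborhood depth used during aggregation.
    """
    primary = set(primary)
    seen = set(primary)
    # Multi-source breadth-first search starting from every primary vertex.
    frontier = deque((vertex, 0) for vertex in primary)
    while frontier:
        vertex, depth = frontier.popleft()
        if depth == order:
            continue  # do not expand beyond the requested order
        for neighbor in adjacency.get(vertex, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    # Everything reached that is not primary must be stored redundantly.
    return seen - primary


# Toy undirected path graph 0-1-2-3-4-5 (illustrative data, not from the paper).
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
left_aux = auxiliary_vertices(path_graph, {0, 1, 2}, order=2)
right_aux = auxiliary_vertices(path_graph, {3, 4, 5}, order=2)
print(sorted(left_aux), sorted(right_aux))
```

In this toy case each partition holds three primary vertices and two redundant copies, so no cross-partition fetch is needed for 2-hop aggregation. A partitioner with the goal stated in the abstract would try to keep such auxiliary sets small while keeping the partitions balanced.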