Lightweight Streaming Graph Partitioning by Fully Utilizing Knowledge from Local View

Zhigang Wang, Zichao Yang, Ning Wang, Yujie Du, Jie Nie, Zhiqiang Wei, Yu Gu, Ge Yu
DOI: https://doi.org/10.1109/icdcs57875.2023.00079
2023-01-01
Abstract: Data partitioning is the most fundamental procedure before parallelizing complex analysis on very large graphs. Because graph partitioning is a classical NP-complete problem, it is usually solved with offline or online (streaming) heuristics that find approximately optimal solutions. However, existing heuristics are either heavyweight in space and time overheads or suboptimal in quality, as measured by workload balance and the number of edges cut across partitions; neither kind scales well with the ever-growing demand for quickly analyzing big graphs. This paper therefore proposes a new vertex partitioner for better scalability. It preserves the lightweight advantage of existing streaming heuristics and, more importantly, fully utilizes the knowledge embedded in the local view when streaming a vertex, which significantly improves quality. We present a sliding-window technique to compensate for the additional memory costs caused by knowledge utilization. A parallel technique with dependency-detection optimization is also designed to further enhance efficiency. Experiments on a range of real-world datasets validate that our proposals achieve overall success in terms of partitioning quality, memory consumption, and runtime efficiency.
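To make the setting concrete, the sketch below shows a classical baseline streaming vertex partitioner together with the two quality measures named in the abstract (edge cut and workload balance). It is not the partitioner proposed in the paper: it uses the well-known linear deterministic greedy (LDG) scoring rule as a stand-in, and the function names, the dict-based adjacency input, integer vertex ids, and the `capacity` parameter are all assumptions made for illustration.

```python
def ldg_partition(adjacency, k, capacity):
    """Baseline streaming partitioner (LDG-style), NOT the paper's method.

    adjacency: dict mapping vertex id -> list of neighbor ids (undirected,
               listed in both directions); stream order is dict order.
    k:         number of partitions.
    capacity:  assumed per-partition vertex capacity used for balancing.
    """
    assignment = {}
    sizes = [0] * k
    for v, neighbors in adjacency.items():
        # Local view: how many already-placed neighbors sit in each partition.
        placed = [0] * k
        for u in neighbors:
            if u in assignment:
                placed[assignment[u]] += 1

        # LDG score: neighbor count discounted by how full the partition is.
        def score(p):
            return placed[p] * (1 - sizes[p] / capacity)

        # Ties are broken toward the least-loaded partition.
        best = max(range(k), key=lambda p: (score(p), -sizes[p]))
        assignment[v] = best
        sizes[best] += 1
    return assignment


def edge_cut(adjacency, assignment):
    """Count undirected edges whose endpoints lie in different partitions."""
    cut = 0
    for v, neighbors in adjacency.items():
        for u in neighbors:
            # v < u counts each undirected edge once (assumes comparable ids).
            if v < u and assignment[v] != assignment[u]:
                cut += 1
    return cut


def balance(assignment, k):
    """Largest partition size divided by the ideal size (1.0 is perfect)."""
    sizes = [0] * k
    for p in assignment.values():
        sizes[p] += 1
    return max(sizes) / (len(assignment) / k)


# Two triangles joined by the single edge 2-3.
graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
parts = ldg_partition(graph, k=2, capacity=3)
cut = edge_cut(graph, parts)
bal = balance(parts, k=2)
```

On this toy graph the baseline keeps each triangle together, so only the bridging edge is cut. The baseline consults only the partition ids of neighbors that have already been placed; the abstract's claim is that a streamed vertex's local view holds more usable knowledge than this.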