Efficient Parallel Subgraph Enumeration

Yingxia Shao, Bin Cui, Lei Chen
DOI: https://doi.org/10.1007/978-981-15-3928-2_4
2020-01-01
Abstract: In this chapter, we introduce a novel parallel subgraph enumeration framework, named PSgL, which is built on top of Pregel-like graph computing systems. PSgL enumerates subgraph instances iteratively and solves the subgraph enumeration problem in a divide-and-conquer fashion. The framework relies entirely on graph traversal operations rather than explicit join operations. To make the framework efficient, we propose several algorithm-specific optimization techniques for balancing the workload and reducing the size of intermediate results. With respect to workload balance, we prove that the problem of distributing partial subgraph instances is NP-hard, and we carefully design heuristic strategies for it. To reduce the massive intermediate results, we develop three mechanisms: automorphism breaking of the pattern graph, initial pattern vertex selection based on a cost model, and a pruning method based on a lightweight index. We implemented a prototype of PSgL and conducted comprehensive experiments with various graph enumeration operations on large real-world graphs. The experimental results clearly demonstrate that PSgL is robust and can achieve performance gains of up to 90% over existing solutions.
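To make two of the ideas in the abstract concrete (traversal-based expansion of partial instances, and automorphism breaking), the following is a minimal single-machine sketch for the triangle pattern. It is not the PSgL implementation: the function names `build_adjacency` and `enumerate_triangles` are hypothetical, the ordering constraint by vertex ID is one common way to break symmetry and is assumed here for illustration, and the distribution, cost model, and index-based pruning described above are omitted.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def enumerate_triangles(edges):
    """Enumerate each triangle exactly once by expanding partial instances.

    Assumption for illustration: symmetry is broken by requiring
    a < b < c on the data-vertex IDs mapped to the three pattern vertices.
    """
    adj = build_adjacency(edges)

    # Iteration 0: map the initial pattern vertex to every data vertex.
    partial = [(a,) for a in sorted(adj)]

    # Iteration 1: expand along an edge by traversal (no join),
    # keeping only neighbours that respect the ordering constraint.
    partial = [(a, b) for (a,) in partial for b in sorted(adj[a]) if b > a]

    # Iteration 2: the last pattern vertex must be adjacent to both
    # mapped vertices and must again respect the ordering constraint.
    return [
        (a, b, c)
        for (a, b) in partial
        for c in sorted(adj[a] & adj[b])
        if c > b
    ]


if __name__ == "__main__":
    sample_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]
    print(enumerate_triangles(sample_edges))
```

A triangle has six automorphisms, so without the ordering constraint each instance would be generated six times. Enforcing the constraint during expansion, rather than deduplicating at the end, is what keeps the intermediate partial instances small.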