BENU: Distributed Subgraph Enumeration with Backtracking-Based Framework

Zhaokang Wang, Rong Gu, Weiwei Hu, Chunfeng Yuan, Yihua Huang
DOI: https://doi.org/10.1109/icde.2019.00021
2019-01-01
Abstract: Given a small pattern graph and a large data graph, the task of subgraph enumeration is to find all the subgraphs of the data graph that are isomorphic to the pattern graph. State-of-the-art distributed algorithms such as SEED and CBF turn subgraph enumeration into a distributed multi-way join problem. They are inefficient in communication because, during the join, they have to shuffle partial matching results that are much larger than the data graph itself. They also incur non-trivial costs in constructing indexes for data graphs. Unlike those join-based algorithms, we develop a new backtracking-based framework, BENU, for distributed subgraph enumeration. BENU divides a subgraph enumeration task into a group of local search tasks that can be executed in parallel. Each local search task follows a backtracking-based execution plan to enumerate subgraphs. The data graph is stored in a distributed database and is queried as needed. BENU queries only the necessary edges of the data graph and avoids shuffling partial matching results. We also develop an efficient implementation of BENU. We set up an in-memory database cache on each machine. By taking advantage of inter-task and intra-task locality, the cache significantly reduces the communication cost with controllable memory usage. We conduct extensive experiments to evaluate the performance of BENU. The results show that BENU is scalable and outperforms the state-of-the-art methods by up to an order of magnitude.
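
The idea described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the toy graph DATA_GRAPH, the function names get_adj, local_search and enumerate_all, and the use of functools.lru_cache as a stand-in for the per-machine database cache are all assumptions made for illustration. The sketch assumes a matching order in which every pattern vertex after the first has at least one earlier neighbour, and it reports every edge-preserving injective mapping, so symmetric copies of the same subgraph are not deduplicated.

```python
from functools import lru_cache

# Hypothetical toy data graph; in the paper's setting this would live in a
# distributed database and be queried over the network.
DATA_GRAPH = {
    1: {2, 3, 4},
    2: {1, 3},
    3: {1, 2, 4},
    4: {1, 3},
}


@lru_cache(maxsize=1024)  # assumed stand-in for the in-memory database cache
def get_adj(v):
    """Simulated database query returning the adjacency set of vertex v."""
    return frozenset(DATA_GRAPH.get(v, ()))


def local_search(pattern, order, start):
    """One local search task: match order[0] to `start`, then backtrack."""
    results = []
    mapping = {order[0]: start}

    def extend():
        if len(mapping) == len(order):
            results.append(dict(mapping))
            return
        u = order[len(mapping)]  # next pattern vertex to match
        # Candidates are the common neighbours of the data vertices already
        # matched to u's pattern neighbours; only those adjacency sets are
        # queried.
        cand = None
        for w in pattern[u]:
            if w in mapping:
                adj = get_adj(mapping[w])
                cand = adj if cand is None else cand & adj
        assert cand is not None, "matching order must be connected"
        used = set(mapping.values())
        for v in sorted(cand - used):
            mapping[u] = v
            extend()
            del mapping[u]  # backtrack

    extend()
    return results


def enumerate_all(pattern, order):
    """Run one independent local search task per data vertex."""
    matches = []
    for start in sorted(DATA_GRAPH):
        matches.extend(local_search(pattern, order, start))
    return matches


# Triangle pattern given as an adjacency dictionary.
TRIANGLE = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
matches = enumerate_all(TRIANGLE, ["a", "b", "c"])
triangles = {frozenset(m.values()) for m in matches}
```

Because each task starts from a different data vertex and shares no state other than the cache, the loop in enumerate_all could in principle be distributed across workers. Repeated calls to get_adj for the same vertex are served from the cache, which loosely mirrors the locality argument made in the abstract.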