Hybrid Subgraph Matching Framework Powered by Sketch Tree for Distributed Systems

Yuejia Zhang, Weiguo Zheng, Zhijie Zhang, Peng, Xuecang Zhang
DOI: https://doi.org/10.1109/icde53745.2022.00082
2022-01-01
Abstract: With the rapid growth of graph scale, subgraph search becomes challenging when the data graph cannot reside in the memory of a single machine. Developing practical algorithms to answer subgraph queries in distributed systems is important and has attracted extensive attention in recent years. Existing join-based algorithms are natively supported in many distributed engines, but they often suffer from a large number of invalid intermediate results and duplicate computation. Exploration-based algorithms minimize invalid intermediate results, but they are likely to produce results of exponential size. In this paper, we propose an efficient hybrid subgraph matching framework that integrates the advantages of both the join-based and exploration-based paradigms. We formulate a novel decomposition of the query graph, namely the sketch tree, which can reduce invalid intermediate results and avoid duplicate computation. We implement the proposed algorithm in the Pregel+ system and optimize the communication cost using the sketch tree. Extensive experiments on real graphs demonstrate that our proposed algorithm significantly outperforms the state-of-the-art join-based and exploration-based methods.