Efficient Exact Subgraph Matching Via GNN-Based Path Dominance Embedding

Yutong Ye, Xiang Lian, Mingsong Chen
DOI: https://doi.org/10.14778/3654621.3654630
2024-01-01
Abstract: The classic problem of exact subgraph matching returns those subgraphs in a large-scale data graph that are isomorphic to a given query graph, which has gained increasing importance in many real-world applications such as social network analysis, knowledge graph discovery in the Semantic Web, bibliographical network mining, and so on. In this paper, we propose a novel and effective graph neural network (GNN)-based path embedding framework (GNN-PE), which allows efficient exact subgraph matching without introducing false dismissals. Unlike traditional GNN-based graph embeddings that only produce approximate subgraph matching results, in this paper, we carefully devise GNN-based embeddings for paths, such that: if two paths (and 1-hop neighbors of vertices on them) have the subgraph relationship, their corresponding GNN-based embedding vectors will strictly follow the dominance relationship. With such a newly designed property of path dominance embeddings, we are able to propose effective pruning strategies based on path label/dominance embeddings and guarantee no false dismissals for subgraph matching. We build multidimensional indexes over path embedding vectors, and develop an efficient subgraph matching algorithm by traversing indexes over graph partitions in parallel and applying our pruning methods. We also propose a cost-model-based query plan that obtains query paths from the query graph with low query cost. Through extensive experiments, we confirm the efficiency and effectiveness of our proposed GNN-PE approach for exact subgraph matching on both real and synthetic graph data.
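The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "dominance" means the query path's embedding is less than or equal to the data path's embedding in every dimension, and it uses a plain tuple of vertex labels as a stand-in for the path label embedding. The names `dominates` and `prune_candidates` and the toy vectors are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
PathLabel = Tuple[str, ...]


def dominates(query_vec: Vector, data_vec: Vector) -> bool:
    """Return True if query_vec <= data_vec in every dimension.

    Assumption: this component-wise test is the dominance relation
    that the embeddings are trained to satisfy.
    """
    if len(query_vec) != len(data_vec):
        raise ValueError("embedding vectors must have the same dimension")
    return all(q <= d for q, d in zip(query_vec, data_vec))


def prune_candidates(
    query_label: PathLabel,
    query_vec: Vector,
    data_paths: Dict[str, Tuple[PathLabel, Vector]],
) -> List[str]:
    """Keep only data paths that pass both the label and dominance checks.

    A data path is discarded when its labels differ from the query path's
    labels, or when the query embedding does not dominate its embedding.
    If the embeddings satisfy the dominance property, neither check can
    discard a true match, so the filter yields no false dismissals.
    """
    survivors = []
    for path_id, (label, vec) in data_paths.items():
        if label != query_label:
            continue  # label-based pruning
        if not dominates(query_vec, vec):
            continue  # dominance-based pruning
        survivors.append(path_id)
    return survivors


# Toy data: hypothetical 2-dimensional embeddings for three data paths.
data_paths = {
    "p1": (("A", "B"), (0.3, 0.7)),
    "p2": (("A", "B"), (0.1, 0.9)),
    "p3": (("A", "C"), (0.9, 0.9)),
}
candidates = prune_candidates(("A", "B"), (0.2, 0.5), data_paths)
print(candidates)
```

The surviving candidates would still need exact verification against the query graph. This sketch scans every path linearly, whereas the abstract describes multidimensional indexes over the embedding vectors that allow such checks to skip whole groups of paths at once.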