Efficient subhypergraph matching based on hyperedge features
Yuhang Su, Yu Gu, Zhigang Wang, Ying Zhang, Jianbin Qin, Ge Yu
DOI: https://doi.org/10.1109/tkde.2022.3160393
IF: 9.235
2022-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: Hypergraphs consist of vertices and hyperedges, where a hyperedge can connect multiple vertices. Since hypergraphs can effectively model complex intergroup relationships between entities, they have a wide range of applications in areas such as computer vision and bioinformatics. In this paper, we study the subhypergraph matching problem, which is one of the most challenging problems in hypergraph processing. We aim to extract all subhypergraph isomorphism embeddings of a query hypergraph $q$ in a large data hypergraph $D$. Existing subgraph matching methods are designed for ordinary graphs and typically proceed in three phases: filtering candidate vertices, refining candidate sets, and then enumerating final results in some matching order. However, such a design cannot be trivially extended to handle hypergraphs efficiently because of the inherent differences between ordinary graphs and hypergraphs. This motivates us to improve performance by exploiting hyperedge features, such as the intersection and inclusion relations between hyperedges. We present an efficient subhypergraph matching solution with two novel techniques: maximum hyperedge candidate filtering and a co-occurrence matrix candidate refinement strategy. Maximum hyperedge candidate filtering is a filtering method based on hyperedge features that provides powerful pruning capability. The co-occurrence matrix candidate refinement strategy considers the high-order relationships between vertices in the hypergraph and provides an effective candidate refinement scheme that further reduces the overall search space. To find a more effective matching order, we design a new enumeration strategy, which computes the pseudo-isomorphic mapping set and then performs hyperedge verification. Extensive experiments on real and synthetic data sets show that our method outperforms existing methods by up to two orders of magnitude.
computer science, information systems, artificial intelligence, engineering, electrical & electronic
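
To make the problem statement in the abstract concrete, the following is a minimal brute-force sketch of subhypergraph matching. It is not the method of the paper: it does no candidate filtering, refinement, or matching-order selection, and it assumes unlabeled vertices. The function name `subhypergraph_embeddings` and the representation of a hypergraph as a list of vertex sets are assumptions made only for this illustration. An embedding is taken here to be an injective mapping from query vertices to data vertices under which every query hyperedge maps exactly onto a data hyperedge.

```python
from itertools import permutations


def subhypergraph_embeddings(query_edges, data_edges):
    """Naively enumerate all embeddings of a query hypergraph in a data hypergraph.

    Each hypergraph is given as an iterable of hyperedges, and each hyperedge
    is an iterable of vertex identifiers. Vertices are unlabeled (assumption).
    """
    q_edges = [frozenset(e) for e in query_edges]
    d_edges = {frozenset(e) for e in data_edges}
    q_vertices = sorted(set().union(*q_edges))
    d_vertices = sorted(set().union(*d_edges))

    results = []
    # Every ordered selection of distinct data vertices is one injective mapping.
    for image in permutations(d_vertices, len(q_vertices)):
        mapping = dict(zip(q_vertices, image))
        # Keep the mapping only if each query hyperedge lands on a data hyperedge.
        if all(frozenset(mapping[v] for v in e) in d_edges for e in q_edges):
            results.append(mapping)
    return results


if __name__ == "__main__":
    query = [{"a", "b", "c"}, {"c", "d"}]
    data = [{1, 2, 3}, {3, 4}, {4, 5}]
    for m in subhypergraph_embeddings(query, data):
        print(m)
```

On this toy input the query vertex `c` must map to data vertex 3 and `d` to 4, while `a` and `b` can map to 1 and 2 in either order, so two embeddings are found. The number of mappings tried grows factorially with the size of the hypergraphs, which is why filtering and refinement steps of the kind the abstract describes matter in practice.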