Approximating Diversified Top-K Graph Pattern Matching

Xin Wang, Huayi Zhan
DOI: https://doi.org/10.1007/978-3-319-98809-2_25
2018-01-01
Abstract: Graph pattern matching is increasingly used in applications such as social network analysis. Because matching semantics are typically defined in terms of subgraph isomorphism, several problems arise: (1) matching computation is often very expensive, due to the intractability of the problem; (2) the semantics are often too strict to identify meaningful matches; and (3) there may be an excessive number of matches, which makes inspection very difficult. Moreover, users are often interested in diversified top-k matches rather than the entire match set, since result diversification has proven effective in improving user satisfaction, and top-k matches not only ease result understanding but can also save matching cost. Motivated by these observations, this paper investigates approximating diversified top-k graph pattern matching. (1) We extend the traditional notion of subgraph isomorphism by allowing edge-to-path mappings, and define matching based on the revised notion. With this extension, more meaningful matches can be captured. (2) We propose two functions for ranking matches: a relevance function w(.) based on tightness of connectivity, and a distance function d(.) measuring match diversity. Based on the relevance and distance functions, we propose a diversification function F(.), and formalize the diversified top-k graph pattern matching problem using F(.). (3) Despite the hardness of the problem, we provide two approximation algorithms with performance guarantees, one of which also preserves the early termination property. (4) Using real-life and synthetic data, we experimentally verify that our approximation algorithms are effective and outperform traditional matching algorithms.
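To make the roles of w(.), d(.), and F(.) concrete, here is a minimal Python sketch of diversified top-k selection. It is an illustration under stated assumptions, not the paper's algorithms: the abstract does not define F(.), so the sketch assumes a common bi-criteria form, F(S) = (1 - lam) * sum of w(m) + (2 * lam / (k - 1)) * sum of pairwise d(m_i, m_j). It also assumes d(.) is the Jaccard distance over matched node sets and takes relevance scores as given numbers. The function names, the parameter lam, and the greedy strategy are all hypothetical, and the sketch makes no claim about approximation guarantees or early termination.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Assumed d(.): Jaccard distance between two matches given as node sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def diversification_score(selected, relevance, lam=0.5, k=None):
    """Assumed F(.): weighted relevance sum plus scaled pairwise distance sum."""
    k = k or len(selected)
    rel = sum(relevance[m] for m in selected)
    div = sum(jaccard_distance(a, b) for a, b in combinations(selected, 2))
    # The 2 / (k - 1) factor balances k relevance terms against k(k-1)/2 pairs.
    scale = 2.0 * lam / (k - 1) if k > 1 else 0.0
    return (1.0 - lam) * rel + scale * div


def greedy_top_k(matches, relevance, k, lam=0.5):
    """Illustrative greedy: repeatedly add the match that maximizes the score."""
    selected = []
    remaining = list(matches)
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda m: diversification_score(selected + [m], relevance, lam, k),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical matches: two overlapping ones and one disjoint, less relevant one.
m1 = frozenset({1, 2, 3})
m2 = frozenset({1, 2, 4})
m3 = frozenset({7, 8, 9})
relevance = {m1: 1.0, m2: 0.9, m3: 0.5}

diverse = greedy_top_k([m1, m2, m3], relevance, k=2, lam=0.5)
relevant_only = greedy_top_k([m1, m2, m3], relevance, k=2, lam=0.0)
```

With lam = 0.5 the sketch prefers the disjoint match m3 over the near-duplicate m2, while lam = 0 reduces to plain top-k by relevance. This is the trade-off a diversification function is meant to expose.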