Efficient Subgraph Join Based on Connectivity Similarity

Yue Wang, Hongzhi Wang, Jianzhong Li, Hong Gao
DOI: https://doi.org/10.1007/s11280-014-0286-0
2014-01-01
World Wide Web
Abstract: Graphs are a widely accepted model for representing complex data and have been applied in many real-world settings, including social networks, chemistry, and pattern recognition. Noisy and inconsistent data make graph similarity join imperative. The graph similarity join problem studied in this paper is to find pairs of graphs that can be joined according to a similarity metric. Exact graph join based on edit distance has been proved to be NP-hard, which makes approximate similarity join essential. In this paper, we propose a connectivity-similarity-based matching method, which provides a new measure for evaluating graph similarity. We also apply a strategy called vertex similarity upper bound filtering to obtain a set of promising candidate pairs, which improves join efficiency. Experiments on real and synthetic graph databases show that the proposed method achieves both good result quality and high efficiency among approximate join methods.
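The filter-then-verify pattern mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not define connectivity similarity or the exact bound, so the degree-ratio vertex similarity, the greedy matching used as the "expensive" verification step, and every function name below are assumptions made purely for illustration. The sketch only shows why an upper bound that ignores the one-to-one matching constraint can prune pairs safely.

```python
from fractions import Fraction
from itertools import combinations

def degrees(graph):
    """Degree of every vertex in an adjacency-set graph."""
    return [len(neighbours) for neighbours in graph.values()]

def vertex_sim(d1, d2):
    """Hypothetical vertex similarity: smaller degree over larger degree."""
    if d1 == d2:
        return Fraction(1)
    return Fraction(min(d1, d2), max(d1, d2))

def upper_bound(g1, g2):
    """Each vertex of the smaller graph takes its best partner, ignoring
    the one-to-one constraint, so no matching can score higher."""
    small, large = sorted((degrees(g1), degrees(g2)), key=len)
    if not large:
        return Fraction(1)  # assumption: two empty graphs are identical
    return sum(max(vertex_sim(x, y) for y in large) for x in small) / len(large)

def matched_similarity(g1, g2):
    """Stand-in for the costly verification step: a greedy one-to-one
    vertex matching, taken in descending order of vertex similarity."""
    small, large = sorted((degrees(g1), degrees(g2)), key=len)
    if not large:
        return Fraction(1)
    pairs = sorted(
        ((vertex_sim(x, y), i, j)
         for i, x in enumerate(small) for j, y in enumerate(large)),
        reverse=True,
    )
    used_small, used_large, total = set(), set(), Fraction(0)
    for sim, i, j in pairs:
        if i not in used_small and j not in used_large:
            used_small.add(i)
            used_large.add(j)
            total += sim
    return total / len(large)

def similarity_join(graphs, threshold):
    """Return (name1, name2, similarity) for every pair at or above threshold."""
    results = []
    for (n1, g1), (n2, g2) in combinations(graphs.items(), 2):
        if upper_bound(g1, g2) < threshold:
            continue  # pruned: the true score cannot reach the threshold
        sim = matched_similarity(g1, g2)
        if sim >= threshold:
            results.append((n1, n2, sim))
    return results

# Toy database: a triangle, a 3-vertex path, and a star with four leaves.
graphs = {
    "triangle": {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}},
    "path": {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}},
    "star": {"hub": {"w", "x", "y", "z"}, "w": {"hub"}, "x": {"hub"},
             "y": {"hub"}, "z": {"hub"}},
}
joined = similarity_join(graphs, 0.6)
```

With a threshold of 0.6, only the triangle and path pair survives; both pairs involving the star are discarded by the bound before the matching step runs. Exact fractions are used so that the bound and the verified score can be compared without floating-point rounding concerns.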