Effective Similarity Search on Heterogeneous Networks: A Meta-path Free Approach
Yue Wang, Zhe Wang, Ziyuan Zhao, Zijian Li, Xun Jian, Hao Xin, Lei Chen, Jianchun Song, Zhenhong Chen, Meng Zhao
DOI: https://doi.org/10.1109/tkde.2020.3019488
IF: 9.235
2020-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: Heterogeneous information networks (HINs) are commonly used to model information systems with multiple types of objects and relations. In contrast, graphs that have a single type of node and edge are often called homogeneous graphs. Measuring similarities among objects is an important task in data mining applications such as web search, link prediction, and clustering. Several similarity measures are currently defined for HINs. Most of these measures are based on meta-paths, which describe sequences of node classes and edge types along the paths between two nodes. However, meta-paths, which are often designed by domain experts, are hard to enumerate and to choose with respect to the quality of the resulting similarity scores. This makes existing similarity measures difficult to use in real applications. To address this problem, we extend SimRank, a well-known similarity measure on homogeneous graphs, to HINs by introducing the concept of the decay graph. The newly proposed similarity measure, called HowSim, is meta-path free and captures structural and semantic similarity simultaneously. Extensive experiments demonstrate the generality and effectiveness of HowSim, as well as the efficiency of our proposed algorithms for computing HowSim scores.
computer science, information systems, artificial intelligence, engineering, electrical & electronic
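
For readers unfamiliar with SimRank, the homogeneous-graph measure that the abstract says HowSim extends, the following is a minimal sketch of the standard iterative SimRank recurrence: two distinct nodes are similar to the extent that their in-neighbors are similar. It is not HowSim and does not model the decay graph, which the abstract does not specify. The function name `simrank`, the parameters `decay` and `iterations`, and the toy graph are assumptions made for illustration only.

```python
from itertools import product


def simrank(in_neighbors, decay=0.8, iterations=10):
    """Naive iterative SimRank on a homogeneous directed graph.

    in_neighbors maps every node to the list of nodes that point to it.
    Every node must appear as a key, even if its list is empty.
    `decay` and `iterations` are illustrative defaults, not values
    taken from the paper.
    """
    nodes = list(in_neighbors)
    # Initialisation: a node is fully similar to itself, not at all to others.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}

    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                # A node without in-neighbors has no evidence of similarity.
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of in-neighbors, damped by decay.
            total = sum(sim[(i, j)] for i in ia for j in ib)
            new_sim[(a, b)] = decay * total / (len(ia) * len(ib))
        sim = new_sim
    return sim


# Hypothetical toy graph: node "a" points to both "x" and "y".
graph = {"a": [], "x": ["a"], "y": ["a"]}
scores = simrank(graph)
```

In this toy graph, "x" and "y" share their only in-neighbor, so their score equals the decay factor, while "a" has no in-neighbors and scores zero against every other node. Because this version ignores node and edge types entirely, it treats a HIN as if it were homogeneous, which is the gap the abstract's meta-path free measure is meant to address.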