UniWalk: Unidirectional Random Walk Based Scalable SimRank Computation over Large Graph

Junshuai Song,Xiongcai Luo,Jun Gao,Chang Zhou,Hu Wei,Jeffery Xu Yu
DOI: https://doi.org/10.1109/tkde.2017.2779126
IF: 9.235
2018-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract:SimRank is an effective structural similarity measurement between two vertices in a graph, which can be used in many applications like recommender systems. Although progresses have been achieved, existing methods still face challenges to handle large graphs. Besides huge index construction and maintenance cost, the existing methods require considerable search space and time overheads in the online SimRank query. In this paper, we design a Monte Carlo based method, Uni-Walk, to enable the fast top-k SimRank computation over large undirected graphs without indexing. UniWalk directly locates the top-k similar vertices for any single source vertex u via O(R) sampling paths originating from u only, which avoids the selection of candidate vertex set C and the following O(|C|R) bidirectional sampling paths starting from u and each candidate respectively in existing methods. We also design a space-efficient method to reduce intermediate results, and a path-sharing strategy to optimize path sampling for multiple source vertices. Furthermore, we extend UniWalk to existing distributed graph processing frameworks to improve its scalability. We conduct extensive experiments to illustrate that UniWalk has high scalability, and outperforms the state-of-the-art methods by orders of magnitude, and such an improvement is achieved without any indexing overheads.
What problem does this paper attempt to address?