Abstract: SimRank is an appealing pairwise similarity measure based on graph structure. It iteratively follows the intuition that two nodes are assessed as similar if they are pointed to by similar nodes. Many real graphs are large, and their links are constantly subject to minor changes. In this article, we study the efficient dynamical computation of all-pairs SimRanks on time-varying graphs. Existing methods for dynamical SimRank computation [e.g., LTSF (Shao et al. in PVLDB 8(8):838–849, 2015) and READS (Zhang et al. in PVLDB 10(5):601–612, 2017)] mainly focus on top-k search with respect to a given query. For all-pairs dynamical SimRank search, Li et al. (EDBT, 2010) proposed an approach that first factorizes the graph via a singular value decomposition (SVD) and then incrementally maintains this factorization in response to link updates, at the expense of exactness. As a result, all pairs of SimRanks are updated only approximately, requiring \(O(r^{4}n^{2})\) time and \(O(r^{2}n^{2})\) memory on a graph with \(n\) nodes, where \(r\) is the target rank of the low-rank SVD. Our solution to the dynamical computation of SimRank comprises five ingredients: (1) We first consider edge updates that are not accompanied by new node insertions. We show that the SimRank update \(\boldsymbol{\Delta}\mathbf{S}\) in response to every link update is expressible as a rank-one Sylvester matrix equation. This provides an incremental method requiring \(O(Kn^{2})\) time and \(O(n^{2})\) memory in the worst case to update \(n^{2}\) pairs of similarities for \(K\) iterations. (2) To speed up the computation further, we propose a lossless pruning strategy that captures the “affected areas” of \(\boldsymbol{\Delta}\mathbf{S}\) to eliminate unnecessary retrieval. This reduces the time of the incremental SimRank to \(O(K(m+|\textsf{AFF}|))\), where \(m\) is the number of edges in the old graph and \(|\textsf{AFF}| \ (\le n^{2})\) is the size of the “affected areas” in \(\boldsymbol{\Delta}\mathbf{S}\); in practice, \(|\textsf{AFF}| \ll n^{2}\). (3) We also consider edge updates that are accompanied by node insertions, and categorize them into three cases according to which end of the inserted edge is a new node. For each case, we devise an efficient incremental algorithm that supports new node insertions and accurately updates the affected SimRanks. (4) We next study batch updates for dynamical SimRank computation, and design an efficient batch incremental method that handles “similar sink edges” simultaneously and eliminates redundant edge updates. (5) To achieve linear memory, we devise a memory-efficient strategy that dynamically updates all pairs of SimRanks column by column in just \(O(Kn+m)\) memory, without the need to store all \(n^{2}\) pairs of old SimRank scores. Experimental studies on various datasets demonstrate that our solution substantially outperforms the existing incremental SimRank methods and is faster and more memory-efficient than its competitors on million-scale graphs.
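
To make ingredients (1) and (2) more concrete, the following is a minimal sketch, not the article's algorithm. It assumes the commonly used matrix form of SimRank, \(\mathbf{S} = C\,\mathbf{Q}\mathbf{S}\mathbf{Q}^{T} + (1-C)\,\mathbf{I}\), where \(\mathbf{Q}\) is the row-normalized transpose of the adjacency matrix; the abstract itself does not state this form. The helper names, the toy graph, and the values `C = 0.6` and `K = 10` are all hypothetical. The sketch recomputes SimRank from scratch before and after one edge insertion, checks that only one row of \(\mathbf{Q}\) changes (which is what makes the update rank-one), and counts the entries of \(\boldsymbol{\Delta}\mathbf{S}\) that actually change as a stand-in for \(|\textsf{AFF}|\).

```python
def backward_transition(n, edges):
    """Q[j][i] = 1/indeg(j) if there is an edge i -> j, else 0 (assumed form)."""
    indeg = [0] * n
    for _, j in edges:
        indeg[j] += 1
    Q = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        Q[j][i] = 1.0 / indeg[j]
    return Q


def matmul(A, B):
    """Dense matrix product on lists of lists (fine for a toy graph)."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def simrank(n, edges, C=0.6, K=10):
    """K iterations of S <- C * Q S Q^T + (1 - C) * I, recomputed from scratch."""
    Q = backward_transition(n, edges)
    Qt = [list(col) for col in zip(*Q)]
    S = [[(1 - C) if r == c else 0.0 for c in range(n)] for r in range(n)]
    for _ in range(K):
        M = matmul(matmul(Q, S), Qt)
        S = [[C * M[r][c] + ((1 - C) if r == c else 0.0) for c in range(n)]
             for r in range(n)]
    return S


# Hypothetical toy graph on 5 nodes; the update inserts the edge 0 -> 3.
n = 5
old_edges = {(0, 2), (1, 2), (2, 3), (3, 4)}
new_edges = old_edges | {(0, 3)}

Q_old = backward_transition(n, old_edges)
Q_new = backward_transition(n, new_edges)

# Only the row of the edge's target node changes, so Q_new - Q_old = u v^T.
changed_rows = [r for r in range(n) if Q_old[r] != Q_new[r]]

S_old = simrank(n, old_edges)
S_new = simrank(n, new_edges)
dS = [[S_new[r][c] - S_old[r][c] for c in range(n)] for r in range(n)]

# Entries of Delta S that actually change: a stand-in for |AFF|.
aff = sum(1 for r in range(n) for c in range(n) if abs(dS[r][c]) > 1e-12)

print("rows of Q that changed:", changed_rows)
print("affected entries:", aff, "out of", n * n)
```

Recomputing from scratch costs a full pass over all \(n^{2}\) pairs per iteration, which is the cost the incremental and pruned methods described above are designed to avoid; the sketch only illustrates why a single edge update touches a limited part of \(\mathbf{S}\).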
