Abstract: Given a large graph, how can we determine similarity between nodes in a fast and accurate way? Random walk with restart (RWR) is a popular measure for this purpose and has been exploited in numerous data mining applications including ranking, anomaly detection, link prediction, and community detection. However, previous methods for computing exact RWR require prohibitive storage sizes and computational costs, and alternative methods that avoid such costs by computing approximate RWR have limited accuracy. In this paper, we propose TPA, a fast, scalable, and highly accurate method for computing approximate RWR on large graphs. TPA exploits two important properties of RWR: 1) nodes close to a seed node are likely to be revisited in subsequent steps due to the block-wise structure of many real-world graphs, and 2) RWR scores of nodes that reside far from the seed node are proportional to their PageRank scores. Based on these two properties, TPA divides the approximate RWR problem into two subproblems called neighbor approximation and stranger approximation. In the neighbor approximation, TPA estimates RWR scores of nodes close to the seed based on the scores of a few early steps from the seed. In the stranger approximation, TPA estimates RWR scores for nodes far from the seed using their PageRank scores. The stranger and neighbor approximations are conducted in the preprocessing phase and the online phase, respectively. Through extensive experiments, we show that TPA requires up to 3.5x less time and up to 40x less memory space than other state-of-the-art methods in the preprocessing phase. In the online phase, TPA computes approximate RWR up to 30x faster than existing methods while maintaining high accuracy.
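The split between a seed-dependent online part and a seed-independent preprocessed part can be illustrated with a small sketch. This is not the TPA implementation: it is a simplified illustration that assumes RWR is written as the series c * sum_i (1-c)^i P^i q, computes the first T terms exactly from the seed, and replaces the remaining terms with the corresponding tail of the PageRank series. The function names, the toy graph, and the values of c and T are all hypothetical choices made for this example, and the neighbor approximation described above is not modeled.

```python
def walk_step(adj, x):
    """Push each node's mass evenly to its out-neighbors (one random-walk step).

    Assumes every node has at least one out-neighbor.
    """
    y = [0.0] * len(adj)
    for u, nbrs in enumerate(adj):
        share = x[u] / len(nbrs)
        for v in nbrs:
            y[v] += share
    return y


def restart_series(adj, start, c, first, last):
    """Sum c * (1-c)^i * P^i * start over the steps first <= i < last."""
    n = len(adj)
    x = list(start)  # invariant: x == (1-c)^i * P^i * start at step i
    total = [0.0] * n
    for i in range(last):
        if i >= first:
            for v in range(n):
                total[v] += c * x[v]
        x = [(1.0 - c) * val for val in walk_step(adj, x)]
    return total


def one_hot(n, seed):
    q = [0.0] * n
    q[seed] = 1.0
    return q


# Toy undirected graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
adj = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
n, seed = len(adj), 0
c, T, max_steps = 0.15, 5, 200  # restart probability, split point, series length

# Reference: the full series started from the seed.
exact = restart_series(adj, one_hot(n, seed), c, 0, max_steps)

# Preprocessing (seed-independent): tail of the PageRank series from step T on.
pagerank_tail = restart_series(adj, [1.0 / n] * n, c, T, max_steps)

# Online (seed-dependent): only the first T steps from the seed are computed.
head = restart_series(adj, one_hot(n, seed), c, 0, T)
approx = [h + t for h, t in zip(head, pagerank_tail)]

l1_error = sum(abs(a - b) for a, b in zip(exact, approx))
print("exact :", [round(v, 4) for v in exact])
print("approx:", [round(v, 4) for v in approx])
print("L1 error:", l1_error)
```

In this sketch the online cost depends only on T walk steps, while the tail is computed once and reused for every seed. The tail of either series carries total mass of about (1-c)^T, so the L1 error of the substitution is at most 2(1-c)^T, and a larger T trades online time for accuracy.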

Random walk with restart on hypergraphs: fast computation and an application to anomaly detection

Irwr: Incremental Random Walk With Restart

TPA: Fast, Scalable, and Accurate Method for Approximate Random Walk with Restart on Billion Scale Graphs

Frustrated Random Walks: A Fast Method to Compute Node Distances on Hypergraphs

Fast Flow-based Random Walk with Restart in a Multi-query Setting

Flow-Based Community Detection in Hypergraphs

Work-in-Progress: HeteroRW: A Generalized and Efficient Framework for Random Walks in Graph Analysis

Common Neighbors Matter: Fast Random Walk Sampling with Common Neighbor Awareness

GraphWalker: An I/O-Efficient and Resource-Friendly Graph Analytic System for Fast and Scalable Random Walks

TiRGN: Time-Guided Recurrent Graph Network with Local-Global Historical Patterns for Temporal Knowledge Graph Reasoning

Inferring Community Structure Through Maximum Degree-Based Random Walk with Restart

Hippocluster: an efficient, hippocampus-inspired algorithm for graph clustering

Toward Fast and Scalable Random Walks over Disk-Resident Graphs via Efficient I/O Management

Walking with Perception: Efficient Random Walk Sampling via Common Neighbor Awareness

New Random Walk Algorithm Based on Different Seed Nodes for Community Detection

FastRW: A Dataflow-Efficient and Memory-Aware Accelerator for Graph Random Walk on FPGAs

Fast and Accurate Anomaly Detection in Dynamic Graphs with a Two-Pronged Approach

Waddling Random Walk: Fast and Accurate Mining of Motif Statistics in Large Graphs

An Adaptive Random Walk Sampling Method on Dynamic Community Detection

PowerWalk: Scalable Personalized PageRank via Random Walks with Vertex-Centric Decomposition