Abstract: Personalized PageRank (PPR) is a critical measure of the importance of a node t to a source node s in a graph. The Single-Source PPR (SSPPR) query computes the PPR values of all nodes with respect to s on a directed graph G with n nodes and m edges; it is an essential operation widely used in graph applications. In this paper, we propose novel algorithms for answering two variants of SSPPR queries: (i) high-precision queries and (ii) approximate queries. For high-precision queries, Power Iteration (PowItr) and Forward Push (FwdPush) are two fundamental approaches. Given an absolute error threshold λ (typically set as small as 10⁻⁸), the only known running time bound of FwdPush is O(m/λ), much worse than the O(m · log(1/λ)) bound of PowItr. Whether FwdPush can achieve the same running time bound as PowItr remains an open question in the research community. We give a positive answer to this question: we show that the running time of a common implementation of FwdPush is in fact bounded by O(m · log(1/λ)). Based on this finding, we propose a new algorithm, called Power Iteration with Forward Push (PowerPush), which incorporates the strengths of both PowItr and FwdPush. For approximate queries (with a relative error ε), we propose a new algorithm, called SpeedPPR, whose overall expected time is bounded by O(n · log n · log(1/ε)) on scale-free graphs. This improves the state-of-the-art O((n · log n)/ε) bound. We conduct extensive experiments on six real datasets. The experimental results show that PowerPush outperforms the state-of-the-art high-precision algorithm BePi by up to an order of magnitude in both efficiency and accuracy. Furthermore, SpeedPPR outperforms the state-of-the-art approximate algorithm FORA by up to an order of magnitude in all aspects, including query time, accuracy, pre-processing time, and index size.
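
To make the FwdPush idea concrete, the following is a minimal Python sketch of the textbook forward-push procedure with a first-in-first-out work queue. It is an illustration under stated assumptions, not the implementation analyzed in the paper: the function name `forward_push`, the teleport probability `alpha = 0.2`, the per-node threshold `r_max`, and the rule that a node without out-edges sends its remaining mass back to the source are all choices made here for the example.

```python
from collections import deque


def forward_push(graph, source, alpha=0.2, r_max=1e-8):
    """Approximate single-source PPR by forward push (illustrative sketch).

    graph  : dict mapping every node to a list of its out-neighbors
    source : the source node s
    alpha  : teleport probability (assumed value, not from the abstract)
    r_max  : per-out-edge residue threshold (assumed parameter name)
    Returns (reserve, residue); reserve[t] is the PPR estimate of t.
    """
    reserve = {u: 0.0 for u in graph}
    residue = {u: 0.0 for u in graph}
    residue[source] = 1.0

    queue = deque([source])  # FIFO order of nodes waiting to be pushed
    in_queue = {source}

    while queue:
        u = queue.popleft()
        in_queue.discard(u)
        out = graph[u]
        mass = residue[u]
        # Skip nodes whose residue is already below the threshold.
        if mass <= r_max * max(len(out), 1):
            continue

        # Push: keep an alpha fraction, spread the rest over out-neighbors.
        residue[u] = 0.0
        reserve[u] += alpha * mass
        # Assumption: a dead-end node returns its mass to the source.
        targets = out if out else [source]
        share = (1.0 - alpha) * mass / len(targets)
        for v in targets:
            residue[v] += share
            if v not in in_queue and residue[v] > r_max * max(len(graph[v]), 1):
                queue.append(v)
                in_queue.add(v)

    return reserve, residue


# Small hypothetical graph: node 3 is a dead end reachable from node 1.
graph = {0: [1, 2], 1: [2, 3], 2: [0], 3: []}
reserve, residue = forward_push(graph, source=0, r_max=1e-6)
```

Each push moves probability mass from the residue to the reserve without creating or destroying any, so the total of all reserves and residues stays equal to 1, and on termination every remaining residue is at most `r_max` times the node's out-degree (counting a dead end as degree 1).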
