# FAST-PPR: scaling personalized pagerank estimation for large graphs

### Abstract

We propose a new algorithm, FAST-PPR, for estimating personalized PageRank: given start node $s$ and target node $t$ in a directed graph, and given a threshold $\delta$, FAST-PPR estimates the Personalized PageRank $\pi_s(t)$ from $s$ to $t$, guaranteeing a small relative error as long $\pi_s(t)>\delta$. Existing algorithms for this problem have a running-time of $\Omega(1/\delta)$; in comparison, FAST-PPR has a provable average running-time guarantee of $O(\sqrt{ d/\delta})$ (where $d$ is the average in-degree of the graph). This is a significant improvement, since $\delta$ is often $O(1/n)$ (where $n$ is the number of nodes) for applications. We also complement the algorithm with an $\Omega(1/\sqrt{\delta})$ lower bound for PageRank estimation, showing that the dependence on $\delta$ cannot be improved.

We perform a detailed empirical study on numerous massive graphs, showing that FAST-PPR dramatically outperforms existing algorithms. For example, on the 2010 Twitter graph with 1.5 billion edges, for target nodes sampled by popularity, FAST-PPR has a 20 factor speedup over the state of the art. Furthermore, an enhanced version of FAST-PPR has a 160 factor speedup on the Twitter graph, and is at least 20 times faster on all our candidate graphs.

Type
Publication
KDD ‘14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
##### Siddhartha Banerjee
###### Associate Professor

Sid Banerjee is an associate professor in the School of Operations Research at Cornell, working on topics at the intersection of data-driven decision-making, market design, and algorithms for large-scale networks.