Fast Personalized PageRank on MapReduce

Bahman Bahmani, Kaushik Chakrabarti, and Dong Xin

June 2011

In this paper, we design a fast MapReduce algorithm for Monte Carlo approximation of personalized PageRank vec- tors of all the nodes in a graph. The basic idea is very efficiently doing single random walks of a given length start- ing at each node in the graph. More precisely, we design a MapReduce algorithm, which given a graph G and a length λ, outputs a single random walk of length λ starting at each node in G. We will show that the number of MapReduce iterations used by our algorithm is optimal among a broad family of algorithms for the problem, and its I/O efficiency is much better than the existing candidates. We will then show how we can use this algorithm to very efficiently ap- proximate all the personalized PageRank vectors. Our em- pirical evaluation on real-life graph data and in production MapReduce environment shows that our algorithm is signif- icantly more efficient than all the existing algorithms in the MapReduce setting.

Publication type | Inproceedings |

Published in | ACM SIGMOD Conference |

- Large Multidimensional Datasets inside a Database System (Ph.D. Thesis)
- Locally Adaptive Dimensionality Reduction for Indexing Large Time Series Databases
- Supporting Similarity Queries in MARS

> Publications > Fast Personalized PageRank on MapReduce