Bahman Bahmani, Kaushik Chakrabarti, and Dong Xin
In this paper, we design a fast MapReduce algorithm for Monte Carlo approximation of personalized PageRank vec- tors of all the nodes in a graph. The basic idea is very efficiently doing single random walks of a given length start- ing at each node in the graph. More precisely, we design a MapReduce algorithm, which given a graph G and a length λ, outputs a single random walk of length λ starting at each node in G. We will show that the number of MapReduce iterations used by our algorithm is optimal among a broad family of algorithms for the problem, and its I/O efficiency is much better than the existing candidates. We will then show how we can use this algorithm to very efficiently ap- proximate all the personalized PageRank vectors. Our em- pirical evaluation on real-life graph data and in production MapReduce environment shows that our algorithm is signif- icantly more efficient than all the existing algorithms in the MapReduce setting.
|Published in||ACM SIGMOD Conference|