Speaker: Carlos Guestrin
Host: Ofer Dekel
Affiliation: University of Washington
Date recorded: 26 October 2012
GraphLab: Large-scale Machine Learning on Natural Graphs
Implementing efficient parallel ML algorithms is challenging. Existing high-level parallel abstractions such as MapReduce and Pregel are insufficiently expressive to achieve the desired performance, while low-level tools such as MPI are difficult to use, leaving ML experts repeatedly solving the same design challenges. In this talk, I will describe the GraphLab framework, which naturally expresses the asynchronous, dynamic graph computations that are key to state-of-the-art ML algorithms. When these algorithms are expressed in our higher-level abstraction, GraphLab effectively addresses many of the underlying parallelism challenges, including data distribution, optimized communication, and guaranteed sequential consistency, a property that is surprisingly important for many ML algorithms. On a variety of large-scale tasks, GraphLab provides 20-100x performance improvements over Hadoop. In recent months, GraphLab has received thousands of downloads and is being actively used by a number of startups, companies, research labs, and universities.
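To make "asynchronous, dynamic graph computation" concrete, here is a minimal single-threaded sketch in Python using PageRank as the example. It is not GraphLab's API: the function name dynamic_pagerank, its parameters, and the queue-based scheduler are assumptions made for illustration only. "Asynchronous" here means each vertex update reads its neighbors' most recent values with no global barrier between rounds, and "dynamic" means a vertex is rescheduled only when one of its inputs changed noticeably. Because the sketch runs one update at a time, it is trivially sequentially consistent; a parallel engine would have to enforce that property itself.

```python
from collections import deque


def dynamic_pagerank(out_edges, damping=0.85, tol=1e-6):
    """Toy sketch of an asynchronous, dynamic vertex program (hypothetical helper).

    out_edges maps each vertex to a non-empty list of successor vertices.
    Assumption: every vertex appears as a key of out_edges.
    Returns the rank dictionary and the number of vertex updates performed.
    """
    vertices = list(out_edges)

    # Build the reverse adjacency so each vertex can gather from its in-neighbors.
    in_edges = {v: [] for v in vertices}
    for u, targets in out_edges.items():
        for v in targets:
            in_edges[v].append(u)

    rank = {v: 1.0 for v in vertices}
    queue = deque(vertices)   # scheduler: vertices waiting for an update
    queued = set(vertices)    # avoids scheduling the same vertex twice
    updates = 0

    while queue:
        v = queue.popleft()
        queued.discard(v)

        # Gather: read the neighbors' current values (no global barrier).
        total = sum(rank[u] / len(out_edges[u]) for u in in_edges[v])
        new_rank = (1.0 - damping) + damping * total
        change = abs(new_rank - rank[v])
        rank[v] = new_rank
        updates += 1

        # Dynamic scheduling: wake successors only if this vertex moved enough.
        if change > tol:
            for w in out_edges[v]:
                if w not in queued:
                    queue.append(w)
                    queued.add(w)

    return rank, updates


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    ranks, n_updates = dynamic_pagerank(graph)
    print(ranks, n_updates)
```

In this sketch, vertices whose inputs have stopped changing drop out of the queue and receive no further updates, which is where a dynamic schedule saves work compared with recomputing every vertex in every round.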
This talk represents joint work with Yucheng Low, Joey Gonzalez, Aapo Kyrola, Jay Gu, Danny Bickson, and Joseph Bradley.