Evolution and Future Directions of Large-Scale Storage and Computation Systems at Google
Jeffrey Dean (Google)
Underlying the many products and services offered by Google is a
collection of systems and tools that simplify the storage and
processing of large-scale data sets. These systems are intended to
work well in Google's computational environment of large numbers of
commodity machines connected by commodity networking hardware. Our
systems handle issues like storage reliability and availability in the
face of machine failures, and our processing tools make it relatively
easy to write robust computations that run reliably and efficiently on
thousands of machines. In this talk I'll highlight some of the
systems we have built and are currently developing, and discuss some
challenges and future directions for new systems.
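The processing tools mentioned above include MapReduce, whose programming model can be illustrated in a few lines. The following is a hypothetical single-process sketch of the model's map/shuffle/reduce phases (the function names `map_phase`, `shuffle`, and `reduce_phase` are illustrative, not the system's actual API); the real system distributes these phases across thousands of machines and handles machine failures transparently.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit (word, 1) pairs from each input document."""
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle(pairs):
    """Shuffle: group intermediate values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

# Word count, the canonical MapReduce example, on a toy corpus.
docs = ["the quick brown fox", "the lazy dog"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts["the"])  # 2
```

Because the map and reduce functions are side-effect free, the runtime is free to partition the input, run phases in parallel, and re-execute work lost to machine failures.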
About the speaker: Jeff joined Google in 1999 and is currently a Google Fellow in Google's Systems Infrastructure Group. He has co-designed/implemented five generations of Google's crawling, indexing, and query serving systems, and co-designed/implemented major pieces of Google's initial advertising and AdSense for Content systems. He is also a co-designer and co-implementor of Google's distributed computing infrastructure, including the MapReduce and BigTable systems, has worked on system software for statistical machine translation, and implemented a variety of internal and external developer tools.