Speaker: Ariel Rabkin
Affiliation: Princeton University
Host: Jay Lorch
Date recorded: 3 March 2014
We are now entering an era in which organizations collect and process unprecedented data volumes. This "big data" is handled by large-scale distributed systems. My work addresses problems in effectively deploying and managing these systems. In this talk, I will focus on two parts of my research. First, I describe my work building wide-area data collection and analytics pipelines that cope with large and variable bandwidth demands. Second, I describe my work on better configuration management for the increasingly complex software seen in these environments.
In wide-area contexts, available bandwidth can vary over time. Current analytics systems require users to specify in advance the data to be collected. As a consequence, systems are provisioned for the worst case, which is costly and inflexible. We are building a distributed analytics system, JetStream, designed for the wide area. JetStream lets users specify explicit policy for how the system should respond to varying data volumes and bandwidth availability. As a result, the system can make optimal use of available resources at each point in time.
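The kind of policy described above can be illustrated with a small sketch. This is not JetStream's real API; the function name, parameters, and the specific strategy (coarsening the aggregation window when the link is saturated) are invented for illustration of the general idea of trading data fidelity for timeliness when bandwidth drops.

```python
# Hypothetical sketch of a JetStream-style degradation policy (names and
# strategy are illustrative, not JetStream's actual interface). When
# available bandwidth falls below the raw data rate, the policy widens
# the aggregation window rather than letting data queue up.

def choose_aggregation_window(raw_rate_mbps: float,
                              bandwidth_mbps: float,
                              base_window_s: int = 1,
                              max_window_s: int = 60) -> int:
    """Return the aggregation window (seconds) that fits the link.

    Widening the window by a factor k reduces the transmitted volume
    roughly k-fold, since k windows' worth of records collapse into one
    aggregate. We pick the smallest power-of-two multiple of the base
    window that fits the available bandwidth.
    """
    if bandwidth_mbps <= 0:
        return max_window_s  # no bandwidth at all: degrade maximally
    window = base_window_s
    # Transmitted rate ~ raw_rate * (base_window / window); coarsen
    # until it fits the link or we hit the degradation limit.
    while (raw_rate_mbps * base_window_s / window > bandwidth_mbps
           and window < max_window_s):
        window *= 2
    return min(window, max_window_s)
```

For example, a 10 Mbps source on a 3 Mbps link would be aggregated over 4-second windows, while a link with headroom keeps full 1-second fidelity. The point of making such a policy explicit is that the *user*, not the system, decides which fidelity to sacrifice under pressure.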
As data grows, so does the complexity of the software used to manage it. Modern software stacks are increasingly complex and correspondingly difficult to configure. When difficulties arise, users and administrators are left to resort to trial and error or Internet searches. My research in this area tames system configuration by applying static analysis. Analysis can determine the dependencies between configuration options and error messages. As a result, system failures can be quickly traced to a small set of potentially responsible options. Users thus get immediate feedback on how to resolve configuration errors.
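The diagnosis step can be sketched as follows. In this toy example, the dependency table and all its entries are invented; in the actual approach the table would be produced offline by static analysis of the program's code, not written by hand. At failure time, a simple lookup narrows a reported error to a few candidate options.

```python
# Toy sketch of error-to-option diagnosis. The table below stands in
# for the output of an offline static analysis that links error-message
# patterns to the configuration options able to trigger them; every
# entry here is hypothetical.

import re

ERROR_TO_OPTIONS = {
    r"Address already in use": ["server.port"],
    r"No such file or directory": ["data.dir", "log.dir"],
    r"Connection refused": ["server.host", "server.port"],
}

def candidate_options(error_message: str) -> list:
    """Return config options potentially responsible for the error."""
    candidates = set()
    for pattern, options in ERROR_TO_OPTIONS.items():
        if re.search(pattern, error_message):
            candidates.update(options)
    return sorted(candidates)
```

Given the log line "bind failed: Address already in use", the lookup points the user at `server.port` rather than at the whole configuration file, which is the "small set of potentially responsible options" the abstract describes.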
©2014 Microsoft Corporation. All rights reserved.