Toward Automatic Policy Refinement in Repair Services for Large Distributed Systems

Moises Goldszmidt, Mihai Budiu, Yue Zhang, and Michael Pechuk

Abstract

To be economically feasible and to offer high levels of availability and performance, large-scale distributed systems depend on the automation of repair services. While there has been considerable work on mechanisms for such automated services, a framework for evaluating and optimizing the policies governing such mechanisms has been lacking. In this paper we propose one such framework and report on our initial experience in applying it to analyze and optimize the operation of a geo-distributed cloud storage system at Microsoft.

Details

Publication type: Inproceedings
Published in: The 3rd ACM SIGOPS International Workshop on Large Scale Distributed Systems and Middleware