Toward Automatic Policy Refinement in Repair Services for Large Distributed Systems

To be economically feasible and to offer high levels of availability and performance, large-scale distributed systems depend on the automation of repair services. While there has been considerable work on mechanisms for such automated services, a framework for evaluating and optimizing the policies governing such mechanisms has been lacking. In this paper, we propose one such framework and report on our initial experience in applying it to analyze and optimize the operation of a geo-distributed cloud storage system at Microsoft.


In The 3rd ACM SIGOPS International Workshop on Large Scale Distributed Systems and Middleware

Details

Type: Inproceedings