Share on Facebook Tweet on Twitter Share on LinkedIn Share by email
Quorum-Based Perfect Failure Detection Service

Wei Chen, Xuezheng Liu, Yunni Xia, and Lidong Zhou

Abstract

A failure detection service is perfect if it eventually detects all failures and every detection correctly identifies a failure that has already occurred. Such a perfect failure detection service serves as a basic building block for many reliable distributed systems, for example in primary/backup replication protocols and distributed lock services. In this paper, we present a comprehensive study on applying quorum systems to the perfect failure detection service in order to enhance the fault tolerance of the service. We provide the precise system model and specification for a quorum-based failure detection service. We prove that stable storage is necessary if the server processes may crash and recover in the middle of the service. We present two novel algorithms that implement the failure detection service and have complementary characteristics. We further develop a set of quality-of-service (QoS) metrics for quorum-based perfect failure detection services, and apply probabilistic analysis to quantify the QoS metrics of the two algorithms.

Details

Publication typeTechReport
NumberMSR-TR-2009-62
> Publications > Quorum-Based Perfect Failure Detection Service