Theoretical Analysis for Communication-Induced Checkpointing Protocols with Rollback-Dependency Trackability

  • Sy-Yen Kuo
  • Jichiang Tsai
  • Yi-Min Wang

MSR-TR-98-13

In this paper, we give a theoretical analysis of communication-induced checkpointing protocols that ensure Rollback-Dependency Trackability (RDT). RDT is the property that all dependencies between local checkpoints are on-line trackable using the transitive dependency vector. We discuss several important issues related to this problem. First, we address some “impossibility” problems. We examine the validity of a common intuition in the literature: that if a protocol forces a checkpoint under a weaker condition, then it must take at least as many forced checkpoints as a protocol that does so under a stronger condition. We also refute, by counterexample, the notion that there is a tradeoff between the number of forced checkpoints and the size of the piggybacked control information. Next, we demonstrate that there is no optimal on-line RDT protocol in terms of the number of forced checkpoints. We then propose some techniques for comparing protocols; these techniques can be used to compare many existing protocols in the literature. Finally, we present a hierarchy graph comparing a family of RDT protocols to summarize the discussion. Our results provide guidelines for designing and evaluating efficient communication-induced checkpointing protocols satisfying the RDT property.
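To make the transitive dependency vector concrete, the following is a minimal sketch of how such a vector is commonly maintained; it is an illustration only, not one of the protocols analyzed in the report. The class name `Process`, its methods, and the indexing convention (each process's own entry starts at 0 and is incremented after every checkpoint) are assumptions made here. The sketch only tracks dependencies and contains no forced-checkpoint rule, so on its own it does not guarantee RDT.

```python
class Process:
    """Hypothetical process that maintains a transitive dependency vector (TDV)."""

    def __init__(self, pid, n):
        self.pid = pid        # index of this process, 0..n-1
        self.tdv = [0] * n    # tdv[pid] is the current checkpoint interval index
        self.saved = []       # TDV snapshot stored with each local checkpoint

    def send(self):
        # Piggyback a copy of the current vector on every outgoing message.
        return list(self.tdv)

    def receive(self, piggybacked):
        # Merge by component-wise maximum, so dependencies propagate transitively.
        self.tdv = [max(a, b) for a, b in zip(self.tdv, piggybacked)]

    def checkpoint(self):
        # Record the vector with the checkpoint, then start a new interval.
        snapshot = list(self.tdv)
        self.saved.append(snapshot)
        self.tdv[self.pid] += 1
        return snapshot
```

In this sketch, entry j of the vector saved with a checkpoint records the highest checkpoint interval of process j on which that checkpoint is known to depend. An RDT protocol additionally forces checkpoints so that every rollback dependency shows up in these vectors.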