Vijayan Prabhakaran, Lakshmi N. Bairavasundaram, Nitin Agrawal, Haryadi S. Gunawi, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau
Commodity file systems trust disks to either work or fail completely, yet modern disks exhibit more complex failure modes. We suggest a new fail-partial failure model for disks, which incorporates realistic localized faults such as latent sector errors and block corruption. We then develop and apply a novel failure-policy fingerprinting framework, to investigate how commodity file systems react to a range of more realistic disk failures. We classify their failure policies in a new taxonomy that measures their Internal RObustNess (IRON), which includes both failure detection and recovery techniques. We show that commodity file system failure policies are often inconsistent, sometimes buggy, and generally inadequate in their ability to recover from partial disk failures. Finally, we design, implement, and evaluate a prototype IRON file system, Linux ixt3, showing that techniques such as in-disk checksumming, replication, and parity greatly enhance file system robustness while incurring minimal time and space overheads.
|Published in||Proceedings of the twentieth ACM symposium on Operating Systems Principles (SOSP'05)|
|Publisher||Association for Computing Machinery, Inc.|
Copyright © 2007 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or firstname.lastname@example.org. The definitive version of this paper can be found at ACM’s Digital Library --http://www.acm.org/dl/.