Improving Storage System Availability with D-GRAID

We present the design, implementation, and evaluation of D-GRAID, a gracefully degrading and quickly recovering RAID storage array. D-GRAID ensures that most files within the file system remain available even when an unexpectedly high number of faults occur. D-GRAID achieves high availability through aggressive replication of semantically critical data and fault-isolated placement of logically related data. D-GRAID also recovers from failures quickly, restoring only live file system data to a hot spare. Both graceful degradation and fault-isolated placement are implemented in a prototype SCSI-based storage system underneath unmodified file systems, demonstrating that powerful "file-system like" functionality can be implemented within a "semantically-smart" disk system behind a narrow block-based interface.
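
The intuition behind fault-isolated placement can be illustrated with a toy model. The sketch below is not the D-GRAID implementation: the function names, the four-disk array, and the eight four-block files are all assumptions made for illustration, and it models neither redundancy nor the replication of critical metadata. It only compares how many files survive one disk failure when each file's blocks are striped across disks versus kept together on a single disk.

```python
def striped_placement(file_id, num_blocks, num_disks):
    """Hypothetical striping: consecutive blocks go to consecutive disks."""
    return [(file_id + i) % num_disks for i in range(num_blocks)]


def isolated_placement(file_id, num_blocks, num_disks):
    """Hypothetical fault-isolated placement: all blocks of a file on one disk."""
    return [file_id % num_disks] * num_blocks


def available_fraction(placement, files, num_disks, failed):
    """Fraction of files with no block on a failed disk.

    Assumes no parity or mirroring, so a file is lost as soon as
    any one of its blocks sits on a failed disk.
    """
    available = 0
    for file_id, num_blocks in files.items():
        disks = placement(file_id, num_blocks, num_disks)
        if not any(d in failed for d in disks):
            available += 1
    return available / len(files)


# Eight files of four blocks each on a four-disk array; disk 0 fails.
files = {file_id: 4 for file_id in range(8)}
failed = {0}

striped = available_fraction(striped_placement, files, 4, failed)
isolated = available_fraction(isolated_placement, files, 4, failed)

print(striped)   # 0.0: every striped file has a block on disk 0
print(isolated)  # 0.75: only the two files placed on disk 0 are lost
```

In this toy setting, striping makes every file depend on every disk, so a single unmasked failure takes out all files, while keeping each file on one disk confines the loss to the files stored there.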

In ACM Transactions on Storage (TOS)

Publisher: Association for Computing Machinery, Inc.
