Pelican: A building Block for Exascale Cold Data Storage

11th USENIX Symposium on Operating Systems Design and Implementation (OSDI '14) |

Publication

A significant fraction of data stored in cloud storage is rarely accessed. This data is referred to as cold data; cost-effective storage for cold data has become a challenge for cloud providers. Pelican is a rack-scale hard disk based storage unit designed as the basic building block for exabyte scale storage for cold data. In Pelican, server, power, cooling and interconnect bandwidth resources are provisioned by design to support cold data workloads; this right-provisioning significantly reduces Pelican’s total cost of ownership compared to traditional disk-based storage.

Resource right-provisioning in Pelican means only 8% of the drives can be concurrently spinning. This introduces complex resource management to be handled by the Pelican storage stack. Resource restrictions are expressed as constraints over the hard drives. The data layout and IO scheduling ensures that these constraints are not violated. We evaluate the performance of a prototype Pelican, and compare against a traditional resource overprovisioned storage rack using a cross-validated simulator. We show that compared to this over-provisioned storage rack Pelican performs well for cold workloads, providing high throughput with acceptable latency.