Performance predictability is a key requirement for high-performance applications in today's multi-tenant data centers. Online services running in infrastructure data centers need such predictability to satisfy application SLAs. Cloud data centers require guaranteed performance to bound customer costs and spur adoption. However, several components of today's data centers are at odds with such high-level application SLAs.
The Predictable Data Centers (PDC) project tackles the issue of unpredictable application performance in data centers. Shared resources such as the network and storage are key contributors to this unpredictability: the bandwidth available across the cloud network and to the cloud storage service can vary significantly. To address this, we are designing a predictable data center architecture that offers performance SLAs across shared resources. Efforts like Oktopus, D3 and Hadrian enable a predictable network. More recently, we have been working on predictable storage: we have designed IOFlow, a software-defined storage architecture that enables performance SLAs across shared storage.
Network Sharing in Multi-tenant Data Centers
Bridging the Tenant-Provider Gap in Cloud Services
Towards Predictable Datacenter Networks
Meeting Deadlines in Data Center Networks
- Ioan Stefanovici, Eno Thereska, Greg O'Shea, Bianca Schroeder, Hitesh Ballani, Thomas Karagiannis, Ant Rowstron, and Tom Talpey, Software-Defined Caching: Managing Caches in Multi-Tenant Data Centers, in ACM Symposium on Cloud Computing (SOCC) 2015, ACM – Association for Computing Machinery, 27 August 2015.
- Keon Jang, Justine Sherry, Hitesh Ballani, and Toby Moncaster, Silo: Predictable Message Latency in the Cloud, in SIGCOMM, ACM – Association for Computing Machinery, August 2015.
- Sebastian Angel, Hitesh Ballani, Thomas Karagiannis, Greg O'Shea, and Eno Thereska, End-to-end Performance Isolation through Virtual Datacenters, in OSDI'14: The 11th USENIX Symposium on Operating Systems Design and Implementation, USENIX – Advanced Computing Systems Association, October 2014.
- Fahad R Dogar, Thomas Karagiannis, Hitesh Ballani, and Ant Rowstron, Decentralized Task-aware Scheduling for Data Center Networks, in SIGCOMM, ACM, August 2014.
- Eno Thereska, Hitesh Ballani, Greg O'Shea, Thomas Karagiannis, Ant Rowstron, Tom Talpey, Richard Black, and Timothy Zhu, IOFlow: A Software-Defined Storage Architecture, in SOSP'13: The 24th ACM Symposium on Operating Systems Principles, ACM, November 2013.
- Keon Jang, Justine Sherry, Hitesh Ballani, and Toby Moncaster, Silo: Predictable Message Completion Time in the Cloud, no. MSR-TR-2013-95, September 2013.
- Fahad Dogar, Thomas Karagiannis, Hitesh Ballani, and Ant Rowstron, Decentralized Task-Aware Scheduling for Data Center Networks, no. MSR-TR-2013-96, September 2013.
- Hitesh Ballani, Keon Jang, Thomas Karagiannis, Changhoon Kim, Dinan Gunawardena, and Greg O'Shea, Chatty Tenants and the Cloud Network Sharing Problem, in USENIX Symposium on Networked Systems Design and Implementation, NSDI, April 2013.
- Virajith Jalaparti, Hitesh Ballani, Paolo Costa, Thomas Karagiannis, and Ant Rowstron, Bridging the Tenant-Provider Gap in Cloud Services, in ACM Symposium on Cloud Computing, SoCC, October 2012.
- Hitesh Ballani, Paolo Costa, Thomas Karagiannis, and Ant Rowstron, The Price Is Right: Towards Location-Independent Costs in Datacenters, in ACM HotNets, November 2011.