The DC Genome Project is a joint project between Microsoft Research (MSR) and Microsoft Global Foundation Services (GFS). Its goal is to use data-driven and feedback control approaches to monitor, analyze, and improve the operating efficiency of data centers, to maximize their capacity utilization, and to minimize their environmental impact.
Genomotes are customized wireless sensor nodes for data center environmental sensing. They communicate over IEEE 802.15.4 radios. To ease deployment and reduce the number of contending wireless nodes, we adopt a chained master-slave design. The master node is a wireless node that also has a serial interface for communicating with the slave nodes. Each slave node has two serial interfaces, one up-chain and one down-chain.
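The chained design can be illustrated with a minimal Python sketch. It is not the Genomote firmware: the class names, the `collect` and `build_packet` methods, and the sample readings are all hypothetical, and the sketch assumes the master only relays what its slaves report. It shows that a chain of several sensing points needs only one radio.

```python
class SlaveNode:
    """A wired sensing node with an up-chain and a down-chain serial link (hypothetical model)."""

    def __init__(self, node_id, reading, down=None):
        self.node_id = node_id
        self.reading = reading
        self.down = down  # next slave further down the chain, or None at the end

    def collect(self):
        """Return this node's sample followed by everything relayed from down-chain."""
        samples = [(self.node_id, self.reading)]
        if self.down is not None:
            samples.extend(self.down.collect())
        return samples


class MasterNode:
    """The only node in the chain with a radio (hypothetical model)."""

    def __init__(self, node_id, first_slave=None):
        self.node_id = node_id
        self.first_slave = first_slave

    def build_packet(self):
        """Gather all chain samples over serial into one payload for the radio."""
        samples = self.first_slave.collect() if self.first_slave else []
        return {"master": self.node_id, "samples": samples}


# A chain of three slaves behind one master: three sensing points, one radio.
s3 = SlaveNode("s3", 24.0)
s2 = SlaveNode("s2", 23.5, down=s3)
s1 = SlaveNode("s1", 22.0, down=s2)
master = MasterNode("m1", first_slave=s1)
packet = master.build_packet()
```

In this model, adding another slave to the chain adds a sensing point without adding a contender on the wireless channel.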
RACNet is the data collection network that connects the wireless Genomotes. Wireless sensor networking faces significant challenges in a data center environment because the number of nodes in a communication neighborhood can be very large. In our experience, between 50% and 80% of the nodes can hear (and therefore interfere with) each other. RACNet uses multiple communication channels and a token-passing mechanism to avoid network congestion. We achieve a data yield of more than 99.5% in production deployments.
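The combination of multiple channels and token passing can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the RACNet protocol itself: the round-robin channel assignment, the function names `assign_channels` and `collect_round`, and the fixed token order are all hypothetical. The point is only that splitting nodes across channels shrinks each contention domain, and that letting only the token holder transmit removes collisions within a channel.

```python
from collections import defaultdict


def assign_channels(node_ids, num_channels):
    """Spread nodes across channels round-robin (an assumed policy)."""
    channels = defaultdict(list)
    for i, node in enumerate(node_ids):
        channels[i % num_channels].append(node)
    return dict(channels)


def collect_round(channels, readings):
    """Simulate one collection round.

    On each channel the token visits the nodes in list order. Only the
    current token holder transmits, so each (slot, channel) pair carries
    at most one transmission.
    """
    schedule = []
    for channel, nodes in channels.items():
        for slot, node in enumerate(nodes):  # slot = position of the token
            schedule.append((slot, channel, node, readings[node]))
    return sorted(schedule)


nodes = [f"m{i}" for i in range(6)]
channels = assign_channels(nodes, num_channels=3)
readings = {node: 20.0 + i for i, node in enumerate(nodes)}
schedule = collect_round(channels, readings)
```

With six nodes on three channels, the round finishes in two token slots instead of six, because the three channels proceed in parallel.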
Cypress Data Management
One direct consequence of taking a data-driven approach to data center management is the need to deal with the massive amount of data generated by sensors (including soft sensors such as application performance counters) and other information sources. Cypress is a compressive data management framework for time series streams. It decomposes each time series into multiple compressed feature streams (called trickles). Trickles can be further grouped to exploit spatial correlation for additional compression. Common queries such as select, trend, histogram, and correlation can be answered directly from the compressed trickles without reconstructing the raw data.
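To make the idea of answering queries from trickles concrete, here is a small Python sketch. The decomposition that Cypress actually uses is not described here, so the sketch assumes a deliberately simple one: a low-frequency trickle of block means and a sparse trickle of large residuals. The names `decompose`, `trend`, `lofi`, and `spikes` are hypothetical.

```python
def decompose(series, block=4, spike_threshold=5.0):
    """Split a series into two assumed trickles.

    'lofi'   : one mean per block of `block` samples
    'spikes' : sparse {sample index: residual} for samples that differ
               from their block mean by at least `spike_threshold`
    """
    lofi = []
    spikes = {}
    for start in range(0, len(series), block):
        chunk = series[start:start + block]
        mean = sum(chunk) / len(chunk)
        lofi.append(mean)
        for offset, value in enumerate(chunk):
            residual = value - mean
            if abs(residual) >= spike_threshold:
                spikes[start + offset] = residual
    return {"block": block, "lofi": lofi, "spikes": spikes}


def trend(trickles):
    """Answer a trend query from the low-frequency trickle alone.

    Returns the average change per sample between the first and last block.
    """
    lofi = trickles["lofi"]
    if len(lofi) < 2:
        return 0.0
    return (lofi[-1] - lofi[0]) / ((len(lofi) - 1) * trickles["block"])


# Twelve temperature samples with one outlier at index 10.
series = [20, 20, 20, 20, 22, 22, 22, 22, 24, 24, 40, 24]
trickles = decompose(series)
slope = trend(trickles)
```

The trend query touches three stored means rather than twelve raw samples, and a query about anomalies would only need to read the sparse `spikes` trickle.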
Using the data collected from servers and their environments, we are working to improve data center operating efficiency through static and dynamic server provisioning. RackPacker is a data-driven static provisioning approach that takes advantage of stationary and statistical variations in workload to improve the utilization of provisioned power. AutoShift is a dynamic provisioning approach that migrates workload onto a minimum number of servers and turns off the servers that are not needed. We use a seasonal time series regression technique for load prediction and dynamically skew the load toward active servers (cf. the NSDI08 publication).
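The dynamic provisioning loop can be sketched as follows. This is an illustration under assumptions, not the AutoShift implementation: it uses the simplest form of seasonal regression (one dummy variable per seasonal slot, whose least-squares solution is the per-slot mean of past observations), and the function names, the 20% headroom, and the per-server capacity are hypothetical.

```python
import math


def seasonal_forecast(history, period):
    """Fit a regression with one dummy variable per seasonal slot.

    With only seasonal dummies, the least-squares estimate for each slot
    is the mean of the past observations that fell in that slot.
    """
    sums = [0.0] * period
    counts = [0] * period
    for t, load in enumerate(history):
        sums[t % period] += load
        counts[t % period] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def servers_needed(predicted_load, per_server_capacity, headroom=0.2):
    """Smallest number of active servers that covers the forecast plus headroom."""
    required = predicted_load * (1 + headroom) / per_server_capacity
    return max(1, math.ceil(required))


# Two past "days" of four slots each (requests per second, made-up numbers).
history = [100, 400, 800, 300, 120, 440, 760, 340]
forecast = seasonal_forecast(history, period=4)
plan = [servers_needed(load, per_server_capacity=100) for load in forecast]
```

Servers beyond the planned count in each slot are candidates for being turned off, while the headroom term leaves room for forecast error.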
Joint Resource Control
The computing (cyber-) systems and the physical systems in a data center have their own distinct dynamics. A user request must be served in milliseconds, while some facility components have a lifetime of over 15 years. Coordinating resource control across these nine orders of magnitude is a great challenge. We envision a holistic control framework in which information and constraints are shared across the physical and computing boundaries to maximize potential energy savings. For example, load balancers can be designed to give more load to the servers that are easier to cool. Workload (and thus power) spikes can be clipped to protect the UPS in an oversubscribed environment. A critical component of this vision is the joint modeling of the various dynamics (continuous time, discrete events, queueing, etc.) and a framework to analyze their interaction.
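The two examples above, cooling-aware load balancing and power spike clipping, can be sketched in Python. Both functions are hypothetical illustrations rather than parts of an existing controller. The sketch assumes that a server's "ease of cooling" can be approximated by its thermal headroom (the gap between an allowed maximum inlet temperature and the measured one) and that clipping means capping granted power at a fixed limit.

```python
def cooling_aware_weights(inlet_temps, max_inlet_temp):
    """Return load-balancing weights proportional to thermal headroom.

    inlet_temps: {server: measured inlet temperature in degrees C}
    Servers at or above max_inlet_temp get weight 0.
    """
    headroom = {s: max(0.0, max_inlet_temp - t) for s, t in inlet_temps.items()}
    total = sum(headroom.values())
    if total == 0:
        # No server has headroom: fall back to an even split.
        return {s: 1.0 / len(headroom) for s in headroom}
    return {s: h / total for s, h in headroom.items()}


def clip_power(demand_watts, cap_watts):
    """Cap the granted power and report how much demand was deferred."""
    granted = min(demand_watts, cap_watts)
    return granted, demand_watts - granted


weights = cooling_aware_weights({"a": 20.0, "b": 24.0, "c": 26.0}, max_inlet_temp=28.0)
granted, deferred = clip_power(demand_watts=1200, cap_watts=1000)
```

Here the physical measurement (inlet temperature) directly shapes a computing decision (request dispatch), which is the kind of cross-boundary information sharing the framework is meant to support.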
- Microsoft Global Foundation Services: Mike Manos, Daniel Costello, Amaya Souarez, Patrick Yantz, Jeff O'Reilly, Kelly Roark, Sean James, Christian Belady, Phil Suver, Charl Kunzmann
- Johns Hopkins University: Andreas Terzis
- Harbin Institute of Technology (China): Qiang Wang
- Intern Students: Gong Chen, Wenbo He, Mike Liang, Lakshmi Ganesh, Galen Reeves, Sorabh Gandhi
- Chieh-Jan Mike Liang, Kaifei Chen, Nissanka Bodhi Priyantha, Jie Liu, and Feng Zhao, RushNet: Practical Traffic Prioritization for Saturated Wireless Sensor Networks, in SenSys (Conference on Embedded Networked Sensor Systems), ACM – Association for Computing Machinery, November 2014.
- Li Zhao, Jacob Brouwer, Jie Liu, Sean James, John Siegler, Aman Kansal, and Eric Peterson, Fuel Cells for Data Centers: Power Generation Inches From the Server, no. MSR-TR-2014-37, March 2014.
- Ana Carolina Riekstin, Sean James, Aman Kansal, Jie Liu, and Eric Peterson, No More Electrical Infrastructure: Towards Fuel Cell Powered Data Centers, in 2013 Workshop on Power-Aware Computing and Systems, ACM, November 2013.
- Chieh-Jan Mike Liang, Kaifei Chen, Jie Liu, Nissanka Bodhi Priyantha, and Feng Zhao, Poster Abstract: Shipping Data from Heterogeneous Protocols on Packet Train, in IPSN (International Conference on Information Processing in Sensor Networks), ACM – Association for Computing Machinery, April 2012.
- Jie Liu and Andreas Terzis, Sensing Data Centers for Energy Efficiency, in Philosophical Transactions of the Royal Society A, vol. 370, no. 1958, pp. 136-157, 13 January 2012.
- Lei Li, Chieh-Jan Mike Liang, Jie Liu, Suman Nath, Andreas Terzis, and Christos Faloutsos, ThermoCast: A Cyber-Physical Forecasting Model for Data Centers, in KDD (ACM SIGKDD Conference on Knowledge Discovery and Data Mining), ACM – Association for Computing Machinery, August 2011.
- Jie Liu, Michel Goraczko, Sean James, Christian Belady, Jiakang Lu, and Kamin Whitehouse, The Data Furnace: Heating Up with Cloud Computing, in 3rd USENIX Workshop on Hot Topics in Cloud Computing, USENIX, June 2011.
- Chieh-Jan Mike Liang, Nissanka Bodhi Priyantha, Jie Liu, and Andreas Terzis, Surviving Wi-Fi Interference in Low Power ZigBee Networks, in SenSys (ACM Conference on Embedded Networked Sensor Systems), Association for Computing Machinery, Inc., 2 November 2010.
- Jie Liu, Automatic Server to Circuit Mapping with The Red Pills, in 2010 Workshop on Power Aware Computing and Systems (HotPower '10), USENIX, 3 October 2010.
- Chieh-Jan Mike Liang, Jie Liu, Liqian Luo, Andreas Terzis, and Feng Zhao, RACNet: A High-Fidelity Data Center Sensing Network, in SenSys (ACM Conference on Embedded Networked Sensor Systems), ACM – Association for Computing Machinery, November 2009.
- MIT Technology Review: Saving Energy in Data Centers
- Popular Science (PopSci.com): Microsoft Practices Sensor-ship
- KOMO4 News: Microsoft Research: From cells to solar system
- IDG News (appeared on PC World, Computer World, etc.): Microsoft shows off Data-center Monitoring System
- New Scientist: Delaying data could cut net's carbon footprint