Cloud Computing and Storage
MSR Technologies

Led by Jin Li, the Cloud Computing and Storage (CCS) group consists of a team of researchers who are passionate about end-to-end research. They share a common belief that the ultimate milestone of cool systems research is a product of significant impact. In addition to pursuing original research and publishing papers in premier venues, they devote significant time and go the extra mile to work with product groups and other partners to transfer the research into products.

CCS has helped Azure architect and implement the Local Reconstruction Codes (LRC) used in Windows Azure Storage. This is a new family of erasure codes that provides a significant reduction in storage overhead and cuts down the minimum number of fragments that need to be read to reconstruct a data fragment. The work has led to hundreds of millions of dollars in savings for Microsoft, a Best Paper Award at USENIX ATC 2012, and a 2013 Microsoft Technical Community Network Storage Technical Achievement Award. CCS has also architected the erasure code used in Storage Spaces in Windows 8.1 and Windows Server 2012 R2, and it has architected and implemented the erasure coding used in Lync, Xbox and RemoteFX.
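
To illustrate why local parities cut the number of fragments read during reconstruction, here is a minimal sketch of the local-group idea only. It is not the production code: the function names, the choice of 12 data fragments in groups of 6, and the use of plain XOR parities are all assumptions made for illustration, and the global parities that a full LRC also carries (to survive multiple failures) are omitted.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length fragments."""
    return bytes(x ^ y for x, y in zip(a, b))


def local_parities(fragments, group_size):
    """Compute one XOR parity per local group of data fragments."""
    groups = [fragments[i:i + group_size]
              for i in range(0, len(fragments), group_size)]
    return [reduce(xor_bytes, group) for group in groups]


def reconstruct(fragments, parities, lost, group_size):
    """Rebuild one lost data fragment from its local group only.

    Returns the rebuilt fragment and the number of fragments read.
    """
    group = lost // group_size
    start = group * group_size
    end = min(start + group_size, len(fragments))
    # Read only the surviving members of the local group plus its parity.
    reads = [fragments[i] for i in range(start, end) if i != lost]
    reads.append(parities[group])
    return reduce(xor_bytes, reads), len(reads)


# Hypothetical layout: 12 data fragments split into two local groups of 6.
data = [bytes([i]) * 4 for i in range(12)]
parities = local_parities(data, group_size=6)
rebuilt, reads = reconstruct(data, parities, lost=7, group_size=6)
```

In this layout a single lost fragment is rebuilt from 6 reads (5 surviving group members plus the local parity), whereas a code without local groups would have to read as many fragments as there are data fragments.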

CCS has helped the Windows File Server group architect and implement the Primary Data Deduplication feature in Windows Server 2012 [Paper] and End-to-End Deduplication for Storage Virtualization in Windows Server 2012 R2. Key contributions include a new data chunking algorithm, a low-RAM-footprint indexing data structure to detect duplicate data (based on ChunkStash), and a data partitioning and reconciliation technique, the latter two for scaling index resource usage with data size. The feature delivers major savings to customers (20-82%) and is among the top three features introduced for Windows File Server in Windows Server 2012. It has received rave reviews (The Register, IT Pro, Ars Technica, IT World, Tech Republic), and there is evidence that some customers are upgrading to Windows Server 2012 solely for the primary data deduplication feature.
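
The general shape of chunk-based deduplication can be sketched as follows. This is a generic illustration, not the chunking algorithm or index used in Windows Server: it assumes a simple gear-style rolling hash for content-defined chunk boundaries and an ordinary in-memory dictionary keyed by SHA-256 digest as the chunk index. All names, constants and size limits here are hypothetical.

```python
import hashlib

# Per-byte pseudo-random values for the rolling hash (derived deterministically).
GEAR = [int.from_bytes(hashlib.sha256(bytes([i])).digest()[:4], "big")
        for i in range(256)]
BOUNDARY_MASK = 0xFC000000   # cut when the top 6 hash bits are zero
MIN_CHUNK = 32               # illustrative limits, far smaller than real systems
MAX_CHUNK = 256


def chunk(data: bytes):
    """Split data at content-defined boundaries chosen by a rolling hash."""
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        length = i - start + 1
        at_boundary = length >= MIN_CHUNK and (h & BOUNDARY_MASK) == 0
        if at_boundary or length >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])   # trailing partial chunk
    return chunks


class DedupStore:
    """Stores each distinct chunk once; files are kept as lists of digests."""

    def __init__(self):
        self.chunks = {}   # digest -> chunk bytes

    def put(self, data: bytes):
        recipe = []
        for c in chunk(data):
            digest = hashlib.sha256(c).hexdigest()
            self.chunks.setdefault(digest, c)   # store only if unseen
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        return b"".join(self.chunks[d] for d in recipe)
```

Storing the same data twice adds no new chunks to the store, which is the source of the space savings. A production index cannot keep every digest in a plain dictionary, which is why the low-RAM-footprint index and the partitioning technique mentioned above matter.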

CCS has exploited the benefits of solid-state drives (SSDs) for storage applications. FlashStore is an SSD-optimized, low-RAM-footprint key-value store that organizes storage on flash in a log-structured manner. It was tech-transferred to Pegasus SSD in the Microsoft backend. SkimpyStash is an ultra-low-RAM-footprint key-value store. The storage design of SkimpyStash has been incorporated into Bw-Tree, a joint project among CCS, the MSR Database group, and the Azure DocumentDB team. Bw-Tree is shipping in SQL Server 2014 (Hekaton) and Azure DocumentDB.
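
As a rough picture of what "log-structured" means for a key-value store, the sketch below appends every write to the end of a log and keeps a small in-memory index pointing at the latest record for each key. It is a generic illustration under stated assumptions, not the FlashStore or SkimpyStash design: the class name, the record layout, and the use of a bytearray as a stand-in for flash are all hypothetical.

```python
import struct


class LogKV:
    """Append-only key-value store with an in-memory offset index."""

    HEADER = struct.Struct("<II")   # key length, value length

    def __init__(self):
        self.log = bytearray()      # stands in for the flash device
        self.index = {}             # key -> offset of its latest record

    def put(self, key: bytes, value: bytes) -> None:
        offset = len(self.log)
        # Writes are always sequential appends, never in-place updates.
        self.log += self.HEADER.pack(len(key), len(value)) + key + value
        self.index[key] = offset

    def get(self, key: bytes):
        offset = self.index.get(key)
        if offset is None:
            return None
        klen, vlen = self.HEADER.unpack_from(self.log, offset)
        start = offset + self.HEADER.size + klen
        return bytes(self.log[start:start + vlen])


kv = LogKV()
kv.put(b"user:1", b"alice")
kv.put(b"user:1", b"bob")       # the older record stays in the log as garbage
latest = kv.get(b"user:1")
```

Sequential appends suit flash, which handles small random in-place writes poorly. The cost is that stale records accumulate and must eventually be garbage-collected, and the RAM used by the index grows with the number of keys, which is the footprint the stores described above aim to shrink.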

Partnering with the Remote Desktop Virtualization (RDV) team, CCS has also helped architect and implement the RemoteFX for WAN feature in Windows 8 and Windows Server 2012, which provides a fast and fluid user experience in a remote session running over any WAN or wireless network [Paper].

CCS has recently expanded its research scope to cloud computing platforms. It hopes to revolutionize how people program a distributed cluster today and to make highly efficient, fine-grained distributed programming accessible to the masses.