MODIST is a practical software model checker for unmodified concurrent, distributed, and cloud systems. It systematically explores different execution paths and simulates a variety of environment faults to discover subtle corner-case defects. We have applied MODIST to Oracle Berkeley DB, MPS (a Paxos implementation), SQL Azure, Windows Azure Storage, and other real systems, and found many new bugs.
This project re-imagines and re-engineers wide area networks, to more than double their efficiency and allow flexible sharing of resources.
Crowded: Digital Piecework and the Politics of Platform Responsibility in Precarious Times looks at crowdsourcing as a focal point for many of the issues raised by the structure of our current information economy: economic value, cultural meaning, and ethics.
Labs: New England
Scalable and Practical App Digging Engine
Buildings have a tremendous impact on our society, resource footprint, environment and health. In the past few years, there have been significant advances in gaining visibility into buildings’ daily operations. As human activities and comfort impact how buildings operate, we argue that the next step is Human-Building Analytics (HBA). Specifically, HBA aims for analytics to be easier and more personalized to individual occupants, and buildings to be more adaptive to occupants’ behavior.
-- Making it easy for app developers of all levels to test their apps under real-world contexts on the cloud or real devices --
The Waypoint project is currently under wraps, but this site will be updated in October when things go live.
As PHY-layer data rates increase, CSMA MAC overheads dominate. The 9 µs slot width at a 1 Gbps data rate can result in MAC efficiency of under 10%. WiFi-Nano proposes a novel technique based on speculative transmission that leverages self-interference cancellation and allows the use of 800 ns slots, reducing CSMA overheads by an order of magnitude.
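The effect of slot width on efficiency can be illustrated with a rough back-of-the-envelope model. The sketch below is not the WiFi-Nano analysis: the function name, the 1500-byte payload, the mean backoff of 7.5 slots, and the 60 µs of fixed per-packet overhead (inter-frame spaces, preamble, acknowledgement) are all illustrative assumptions, chosen only to show how backoff slots come to dominate airtime at high data rates.

```python
def csma_mac_efficiency(payload_bytes, rate_bps, slot_us,
                        avg_backoff_slots=7.5, fixed_overhead_us=60.0):
    """Fraction of airtime spent on payload bits under a simplified CSMA model.

    avg_backoff_slots and fixed_overhead_us are assumed values for
    illustration, not figures taken from the WiFi-Nano work.
    """
    # Time on air for the payload itself, in microseconds.
    payload_us = payload_bytes * 8 / rate_bps * 1e6
    # Time spent counting down backoff slots before transmitting.
    backoff_us = avg_backoff_slots * slot_us
    return payload_us / (payload_us + backoff_us + fixed_overhead_us)


# Same packet and data rate, two different slot widths.
legacy = csma_mac_efficiency(1500, 1e9, slot_us=9.0)
nano = csma_mac_efficiency(1500, 1e9, slot_us=0.8)
print(f"9 us slots:   {legacy:.1%}")
print(f"800 ns slots: {nano:.1%}")
```

Under these assumptions the mean backoff time falls from 67.5 µs to 6 µs. The model holds the fixed overhead constant, so it isolates only the slot-width effect and says nothing about the other overheads a real design would also need to address.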
The quest for higher data rates in WiFi is leading to the development of standards that make use of wide channels (e.g., 40MHz in 802.11n and 80MHz in 802.11ac). We argue against this trend of using wider channels, and instead advocate that radios should communicate concurrently over multiple narrow channels for efficient and fair spectrum utilization. We propose WiFi-NC, a novel PHY-MAC design that allows radios to use WiFi over multiple narrow channels simultaneously.
Dhwani enables information-theoretically secure Near Field Communication (NFC) on existing mobile phones without requiring any special hardware or PKI. It uses the existing microphones and speakers on phones to perform acoustic NFC.
The LKW project is aimed at designing low-power algorithms and systems for admission control to speech systems, i.e., detecting foreground speech, recognizing leading keywords, and verifying speakers on a continuously-on wearable device. Our goal is to consume under 10 mW on average on generic embedded hardware available today and under 100 µW on custom hardware.
In data centers, the IO path to storage is long and complex. It comprises many layers or “stages” with opaque interfaces between them. This makes it hard to enforce end-to-end policies that dictate a storage IO flow’s performance (e.g., guarantee a tenant’s IO bandwidth) and routing (e.g., route an untrusted VM’s traffic through a sanitization middlebox). We are researching architectures that decouple control from data flow to enable such policies.
The Scalable Hyperlink Store is a specialized "database" for the web graph. SHS maintains the web graph in main memory, distributed over many machines. The system is available as C# source code as well as precompiled binaries.
A framework to reason about weaker forms of consistency and isolation in a replicated database.
Energy drain in mobile devices is well recognized to be a serious problem. One solution is to provide tools and guidelines that enable application writers to build more energy-efficient programs. This project explores an alternative that mitigates the ill effects of an energy-hungry application. Our system, E-Loupe, offers a finer-grained approach to ensure predictable energy drain in mobile devices.
Vision is the ultimate source of sensory input that we humans consume. We believe that the next generation of computers will provide the ability to continuously capture and analyze visual information in real-time, thus greatly enhancing the overall experience and efficiency of their users.
Column store technology can provide very substantial performance improvements on data warehousing workloads. This project investigated how to integrate columnar storage into SQL Server. The solution adopted was to add a new index type, the columnstore index, which stores data column-wise instead of row-wise. Columnstore indexes first shipped in SQL Server 2012, and significant enhancements will be included in the next release.
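The difference between row-wise and column-wise layout can be pictured with a toy example. This is only a minimal sketch of the general idea, not SQL Server's columnstore implementation; the table contents, the column names, and the `to_columns` helper are hypothetical.

```python
# Row-wise layout: each record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 80.0},
    {"order_id": 3, "region": "east", "amount": 200.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column-wise layout: all values of one column are stored contiguously.
columns = to_columns(rows)

# An aggregate over one column touches every field of every row here...
total_row_wise = sum(row["amount"] for row in rows)
# ...but only the single "amount" list here.
total_column_wise = sum(columns["amount"])
print(total_row_wise, total_column_wise)
```

Both layouts give the same answer. The column-wise form reads only the data the query needs, which is one reason columnar storage tends to suit scan-heavy data warehousing queries.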
This research project in MSR SVC aims to answer the following question: Can we allow programmers to write cloud applications as though they are accessing centralized, strongly consistent data while at the same time allowing them to specify their consistency/availability/performance (CAP) requirements in terms of service-level agreements (SLAs) that are enforced by the cloud storage system at runtime?
The XCG Lab Security and Cryptography teams do development, applied research, and theoretical research in the fields of systems security and cryptography. These teams include the Cryptography Research team, the Security & Cryptography team, and the Systems Incubation team.
Optimus is a framework for dynamically rewriting an execution plan graph in distributed data-parallel computing at runtime. It enables optimizations that require knowledge of the semantics of the computation, such as language customizations for domain-specific computations including matrix algebra. We address several problems arising in distributed execution including data skew, dynamic data re-partitioning, unbounded iterative computations, and fault tolerance.
Real-time information about businesses, such as the current occupancy and music levels or the type of music or exact song playing now, can be an important factor in the local search decision process. In this work, we propose to automatically crowdsource such rich, real-time business metadata through user check-in events.
Embassies is a new model of client-side application delivery that keeps the client code minimal and secure, while pushing almost all functionality into the vendor-supplied applications. The code in this project implements the system described in the NSDI 2013 paper.
The goal of the Physical Analytics project, or Phytics, is to perform analytics on the physical actions of users.
Online advertising systems bring in direct revenue to companies like Microsoft and Google. Occasionally, anomalous system behavior or malicious user intent can affect the health of such systems. Therefore, such systems require prompt and precise analytics to predict, pre-empt, detect, and diagnose problems such as sudden drops in revenue or new styles of click fraud. To this end, we are working on a set of projects to analyze these complex systems and keep them running in a healthy manner.
Hyder is a transactional indexed-record manager for shared flash. That is, it supports operations on indexed records and transaction operations that bracket the record operations. It is designed to run on a cluster of servers that have shared access to a large pool of network-addressable storage, which stores the indexed records as a multiversion log-structured database. Hyder's main feature is that it scales out without partitioning the database or application.