Systems Research Group (Asia)
- Live Programming
Programming today involves editing code while also running it in our heads. To augment this mental simulation, live programming promises much more fluid feedback between the programmer and a program that is executing while it is being edited.
- MadLINQ: Large-Scale Distributed Matrix Computation for the Cloud
The computational core of many data-intensive applications is best expressed as matrix computations. The MadLINQ project addresses two important research problems: the need for a highly scalable, efficient, and fault-tolerant matrix computation system that is also easy to program, and the seamless integration of such specialized execution engines into a general-purpose data-parallel computing system.
- MODIST: Transparent Model Checking of Unmodified Cloud Systems
MODIST is a practical software model checker for unmodified concurrent, distributed, and cloud systems. MODIST systematically explores different execution paths and simulates a variety of environment faults to discover subtle corner-case defects. We have applied MODIST to Oracle Berkeley DB, MPS (a Paxos implementation), SQL Azure, Windows Azure Storage, and other real systems, and found many new bugs.
Efficient tools are indispensable in the battle against software bugs. In this project, we aim to improve debugging productivity with tools that target the different phases of an interactive and iterative debugging session.
- PASS: Program Analysis for SCOPE Scripts
The PASS project is a continuing collaboration with the Cosmos team that aims to improve the correctness and performance of SCOPE scripts using program analysis techniques. It follows an interdisciplinary research direction that spans programming languages, systems, and databases.
- Temporal Graph Storage and Analysis of Social Data
An explosion of user-generated data from online social networks motivates analysis to extract deep insights from this data's graph of social, temporal, spatial, and topical connections. We are building a system to enable storage and analysis of such graphs that considers their evolution over time as trending topics and social activities change.
- TimeStream: Large-Scale Real-Time Stream Processing in the Cloud
TimeStream is a distributed system designed specifically for low-latency continuous processing of big streaming data on a large cluster of commodity machines. The unique characteristics of this emerging application domain have led to a design that differs significantly from the popular MapReduce-style batch data processing. In particular, we advocate a powerful new abstraction called resilient substitution that caters to the specific needs of this new computation model.