We are currently investigating various topics related to the correctness and performance of software systems, especially in the area of concurrent systems. We place a high value on producing tools and methodologies that can be used by software developers and researchers.
Artemis is a modular application designed for analyzing and troubleshooting the performance of large clusters running datacenter services. Artemis is composed of four modules: (1) distributed log collection and extraction, (2) a database storing the extracted data, (3) an interactive visualization tool for exploring the data, and (4) a plug-in interface (and a set of sample plug-ins) allowing users to implement data analysis tools.
Automatic Mutual Exclusion
This project is exploring a new concurrent programming model, Automatic Mutual Exclusion (AME). In contrast to lock-based programming, and to other programming models built over software transactional memory (STM), we arrange that all shared state is implicitly protected unless the programmer explicitly specifies otherwise. An AME program is composed from serializable atomic fragments. We include features allowing the programmer to delimit and manage the fragments to achieve appropriate program structure and performance. The resulting programming model makes it easier to write correct code than incorrect code. It favors correctness over performance for simple programs, while allowing advanced programmers the expressivity they need.
Coconut is a library for .net for working with matrix mathematics. Coconut is simple to use and presents a familiar interface to uses. However, all expressions are stored symbolically, which enables Coconut to provide a very powerful computation engine which exploits the symbolic nature of the expressions to agressively optimize and distribute the computation.
The goal of DryadLINQ is to make distributed computing on large compute cluster simple enough for ordinary programmers. DryadLINQ combines two important pieces of Microsoft technology: the Dryad distributed execution engine and the .NET Language Integrated Query (LINQ). Dryad provides reliable, distributed computing on thousands of servers for large-scale data parallel applications. LINQ enables developers to write and debug their applications in a SQL-like query language, relying on the entire .NET library and using Visual Studio. DryadLINQ is a simple, powerful, and elegant programming environment for writing large-scale data parallel applications running on large PC clusters.
We present a practical FPGA-based accelerator for solving Boolean Satisfiability problems (SAT). Unlike previous efforts for hardware accelerated SAT solving, our design focuses on accelerating the Boolean Constraint Propagation (BCP) part of the SAT solver, leaving the choices of heuristics such as branching order, restarting policy and learning and backtracking to software. Our novel approach uses an application-specific architecture instead of an instance-specific one to avoid time-consuming FPGA synthesis for each SAT instance. By careful pipelining and avoiding global signal wires, our design is able to achieve much higher clock frequency than that of previous work. Our co-processor can load SAT instances in milliseconds, can handle SAT instances with tens of thousands of variables and clauses using a single FPGA, and can easily be scaled-up to handle more clauses by using multiple FPGAs. Our evaluation using a cycle-accurate simulator shows that the FPGA co-processor can achieve 3.7-38.6x speed up on BCP compared with state-of-the-art software SAT solvers. Single FPGA implementation of the co-processor only consumes 7.16 watt power in the worst case, which is an order of magnitude lower than modern CPU.
The TLA Tools project aims to develop methods and tools that will permit engineers to apply formal specification and verification to high-level designs of concurrent algorithms and systems. In a previous incarnation, TLA+, a complete formal language for specifying and reasoning about concurrent systems, was developed along with an associated set of tools for parsing, pretty printing, and model checking TLA+ specifications. The model checker, TLC, accepts a large subclass of TLA+ specifications including many that people write of real systems. To address some of the limitations of TLC, we are currently working on BTLC, a SAT-based symbolic model checker for TLA+. We are currently investigating possible uses of TLA at Microsoft.