The System Research Group engages in fundamental systems research that spans and bridges theory and practice. Our charter is to design and build state-of-the-art and future systems that enrich computing and social experiences, enable scientific discoveries, and empower people to realize their full potential. We focus on large-scale systems that power internet services and cloud computing, tools for building distributed systems, and new system architectures for emerging hardware and applications.
Distributed Systems and Large Scale Data Processing
In this area, we focus on the design and implementation of distributed storage systems, large-scale parallel data processing systems, debugging and verification mechanisms for such systems, and peer-to-peer and social networking systems. We are committed to understanding and solving system problems that arise both in the real world and in experimental systems, ranging from protocols and algorithms to the architectures and services involved. Besides publishing in leading research conferences and journals, we work closely with product groups to help them build better systems.
WiDS: Distributed System Tools
No work can be done easily without handy tools. This is particularly true for system developers who build large-scale distributed systems, because these systems exhibit complex behaviors and suffer from subtle and yet serious bugs; manually detecting and removing all those bugs is hardly practical or feasible.
WiDS is our long-term research investment in distributed system tools that span the system development and debugging cycles. Our goal is to build tools that help developers attack bugs more effectively and comprehensively, leading to systems that are more robust, reliable, and efficient.
We have built a number of tools using various debugging technologies, such as static analysis, model checking, and predicate checking. For example, MoDist is a transparent model checker that systematically explores the execution paths of a distributed system, effectively revealing bugs in corner cases; R2/iTarget is a lightweight record and replay tool that replays an execution deterministically and faithfully, enabling “time-travel debugging” for root cause analysis; D3S/CloudMeter enables predicate checking on a large, deployed system, allowing people to analyze its behavior and detect anomalies with ease.
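The idea behind record and replay can be illustrated with a minimal sketch. It is a toy illustration only and does not reflect how R2/iTarget is implemented: the `Recorder`, `Replayer`, and `simulate` names are hypothetical, and the only assumption is that every nondeterministic input of the program passes through a single interception point (`env.call`). During recording, each result is logged; during replay, the logged results are returned in the same order, so the run takes the same path.

```python
import random


class Recorder:
    """Hypothetical recorder: runs each nondeterministic call and logs its result."""

    def __init__(self):
        self.log = []

    def call(self, fn, *args):
        result = fn(*args)
        self.log.append(result)
        return result


class Replayer:
    """Hypothetical replayer: returns logged results in order instead of calling fn."""

    def __init__(self, log):
        self._log = list(log)
        self._pos = 0

    def call(self, fn, *args):
        if self._pos >= len(self._log):
            raise RuntimeError("replay ran past the end of the recorded log")
        result = self._log[self._pos]
        self._pos += 1
        return result


def simulate(env, steps=5):
    """Toy application: a running total driven by nondeterministic inputs."""
    total = 0
    trace = []
    for _ in range(steps):
        # All nondeterminism is routed through env.call (an assumption of this sketch).
        total += env.call(random.randint, 1, 100)
        trace.append(total)
    return trace


recorder = Recorder()
original_trace = simulate(recorder)
replayed_trace = simulate(Replayer(recorder.log))
```

Because the replayed run consumes the recorded inputs rather than fresh random values, `replayed_trace` matches `original_trace` step by step, which is the property that makes it possible to revisit an earlier point in a failing execution.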
Emerging Hardware and System Architecture
In the System Research Group, we closely follow emerging computer architecture trends and actively pursue research to understand and leverage new hardware to improve the performance and reliability of computer systems. Our current projects cover a wide range of new technologies. Ongoing projects include designing multi-core-friendly operating systems; implementing next-generation solid-state storage systems and investigating their implications for operating systems and user applications; leveraging multi-core microprocessors to accelerate program analysis and debugging; using reconfigurable hardware to accelerate databases and Boolean satisfiability (SAT) solving; and using graphics processing units (GPUs) to accelerate general-purpose computation.
- Bingsheng He, Mao Yang, Zhenyu Guo, Rishan Chen, Wei Lin, Bing Su, Hongyi Wang, and Lidong Zhou, Wave Computing in the Cloud, in HotOS, USENIX, April 2009
- Ruini Xue, Xuezheng Liu, Ming Wu, Zhenyu Guo, Wenguang Chen, Weimin Zheng, Zheng Zhang, and Geoffrey M. Voelker, MPIWiz: Subgroup Reproducible Replay of MPI Applications, no. MSR-TR-2008-127, September 2008
- Wei Lin, Mao Yang, Lintao Zhang, and Lidong Zhou, PacificA: Replication in Log-Based Distributed Storage Systems, no. MSR-TR-2008-25, February 2008
- Xuezheng Liu, Zhenyu Guo, Xi Wang, Feibo Chen, Xiaochen Lian, Jian Tang, Ming Wu, M. Frans Kaashoek, and Zheng Zhang, D3S: Debugging Deployed Distributed Systems, Association for Computing Machinery, Inc., February 2008
- Zhenyu Guo, Xi Wang, Xuezheng Liu, Wei Lin, and Zheng Zhang, BOX: Icing the APIs, no. MSR-TR-2008-03, January 2008
- Zhenyu Guo, Xi Wang, Xuezheng Liu, Wei Lin, and Zheng Zhang, Towards Pragmatic Library-based Replay, no. MSR-TR-2008-02, January 2008
- Yu Chen and Wei Chen, Decentralized, Connectivity-Preserving, and Cost-Effective Structured Overlay Maintenance, Springer-Verlag, November 2007
- Ming Chen, Wei Chen, Likun Liu, and Zheng Zhang, An Analytical Framework and Its Applications for Studying Brick Storage Reliability, IEEE Computer Society, October 2007
- Mao Yang, Qinyuan Feng, Yafei Dai, and Zheng Zhang, A Multi-dimensional Reputation System Combined with Trust and Incentive Mechanisms in P2P File Sharing Systems, IEEE Computer Society, June 2007
- Ming Chen, Lex Stein, and Zheng Zhang, Dependability, Access Diversity, Low Cost: Pick Two, USENIX, June 2007
- Qiao Lian, Zheng Zhang, Mao Yang, Ben Y. Zhao, Yafei Dai, and Xiaoming Li, An Empirical Study of Collusion Behavior in the Maze P2P File-Sharing System, IEEE Computer Society, June 2007
- Wei Chen, Jialin Zhang, Yu Chen, and Xuezheng Liu, Failure Detectors and Extended Paxos for k-Set Agreement, no. MSR-TR-2007-48, May 2007
- Wei Chen, Jialin Zhang, Yu Chen, and Xuezheng Liu, Partition Approach to Failure Detectors for k-Set Agreement, no. MSR-TR-2007-49, May 2007
- Xuezheng Liu, Wei Lin, Aimin Pan, and Zheng Zhang, WiDS Checker: Combating Bugs in Distributed Systems, USENIX, April 2007
- Lex Stein, David Holland, Margo Seltzer, and Zheng Zhang, Can a File System Virtualize Processors?, Association for Computing Machinery, Inc., March 2007
- Shuo Tang, Yu Chen, and Zheng Zhang, Machine Bank: Own Your Virtual Personal Computer, IEEE Computer Society, March 2007
- Bin Cheng, Xuezheng Liu, Zhengyou Zhang, and Hai Jin, A Measurement Study of a Peer-to-Peer Video-on-Demand System, February 2007
- Chun Yuan, Ni Lao, Ji-Rong Wen, Jiwei Li, Zheng Zhang, Yi-Min Wang, and Wei-Ying Ma, Automated Known Problem Diagnosis with Event Traces, Association for Computing Machinery, Inc., April 2006
- Qiao Lian, Peng Yu, Mao Yang, Zheng Zhang, Yafei Dai, and Xiaoming Li, Robust Incentives via Multi-level Tit-for-tat, February 2006
- Wei Chen and Xuezheng Liu, Enforcing Routing Consistency in Structured Peer-to-Peer Overlays: Should We and Could We?, February 2006
- Zheng Zhang, Qiao Lian, Shiding Lin, Wei Chen, Yu Chen, and Chao Jin, BitVault: a Highly Reliable Distributed Data Retention Platform, Association for Computing Machinery, Inc., December 2005
- Wei Chen, Shiding Lin, Qiao Lian, and Zheng Zhang, Sigma: A Fault-Tolerant Mutual Exclusion Algorithm in Dynamic Distributed Systems Subject to Process Crashes and Memory Losses, Institute of Electrical and Electronics Engineers, Inc., December 2005
- Shiding Lin, Aimin Pan, Rui Guo, and Zheng Zhang, Simulating Large-Scale P2P Systems with the WiDS Toolkit, Institute of Electrical and Electronics Engineers, Inc., September 2005
- Qiao Lian, Wei Chen, Zheng Zhang, Shaomei Wu, and Ben Y. Zhao, Z-Ring: Fast Prefix Routing via a Low Maintenance Membership Protocol, Institute of Electrical and Electronics Engineers, Inc., August 2005
- Qiao Lian, Wei Chen, and Zheng Zhang, On the Impact of Replica Placement to the Reliability of Distributed Brick Storage System, Institute of Electrical and Electronics Engineers, Inc., June 2005
- Shiding Lin, Aimin Pan, Zheng Zhang, Rui Guo, and Zhenyu Guo, WiDS: an Integrated Toolkit for Distributed System Development, USENIX, June 2005
- Qiao Lian, Wei Chen, and Zheng Zhang, On the impact of replica placement to the reliability of distributed brick storage systems, no. MSR-TR-2005-71, June 2005
- Mao Yang, Zhengyou Zhang, Xiaoming Li, and Yafei Dai, An Empirical Study of Free-Riding Behavior in the Maze P2P File-Sharing System, February 2005
- Mao Yang, Hua Chen, Ben Y. Zhao, Yafei Dai, and Zhengyou Zhang, Deployment of a Large-scale Peer-to-Peer Social Network, USENIX, December 2004
- Zheng Zhang, Qiao Lian, and Yu Chen, XRing: Achieving High-Performance Routing Adaptively in Structured P2P, no. MSR-TR-2004-93, September 2004
- Zheng Zhang, Yu Chen, Shi-Ding Lin, Bo-Ying Lu, Shu-Ming Shi, Xing Xie, and Chun Yuan, P2P Resource Pool and Its Application to Optimize Wide-Area Application Level Multicasting, August 2004
- Zheng Zhang, P2P Research and Reality: Some Preliminary Thoughts, Institute of Electrical and Electronics Engineers, Inc., July 2004
- Zheng Zhang, Mallik Mahalingam, Zhichen Xu, and Wenting Tang, Scalable, Structured Data Placement over P2P Storage Utilities, Institute of Electrical and Electronics Engineers, Inc., May 2004
- Zheng Zhang, The Power of DHT as a Logical Space, Institute of Electrical and Electronics Engineers, Inc., May 2004
- Zheng Zhang, Shiding Lin, Qiao Lian, and Chao Jin, RepStore: A Self-Managing and Self-Tuning Storage Backend with Smart Bricks, Institute of Electrical and Electronics Engineers, Inc., May 2004
- Shi-Ding Lin, Qiao Lian, Ming Chen, and Zheng Zhang, A Practical Distributed Mutual Exclusion Protocol in Dynamic Peer-to-Peer Systems, Springer-Verlag, February 2004
- Helen J. Wang, Yih-Chun Hu, Chun Yuan, Zheng Zhang, and Yi-Min Wang, Friends Troubleshooting Network: Towards Privacy-Preserving, Automatic Troubleshooting, Springer-Verlag, February 2004
- Yizhou Lu, Xuezheng Liu, Wensi Xi, Benyu Zhang, Hua Li, Zheng Chen, Shuicheng Yan, and Wei-Ying Ma, Efficient pagerank with same out-link groups, in 2004 Asia Information Retrieval Symposium, January 2004
- Yi-Min Wang, Chad Verbowski, John Dunagan, Yu Chen, Helen J. Wang, Chun Yuan, and Zheng Zhang, STRIDER: A Black-box, State-based Approach to Change and Configuration Management and Support, October 2003
- Zheng Zhang, Xing Xie, Shiding Lin, and Boying Lu, Optimizing Wide-Area Application Level Multicasting using P2P Resource Pool, no. MSR-TR-2003-36, June 2003
- Chun Yuan, Zhigang Hua, and Zheng Zhang, Proxy+: Simple Proxy Augmentation for Dynamic Content Processing, May 2003
- Chun Yuan, Yu Chen, and Zheng Zhang, Evaluation of Edge Caching/Offloading for Dynamic Content Delivery, May 2003
- Chi Zhang and Zheng Zhang, Trading Replication Consistency for Performance and Availability: an Adaptive Approach, Institute of Electrical and Electronics Engineers, Inc., May 2003
- Zhichen Xu, Chunqiang Tang, and Zheng Zhang, Building Topology-Aware Overlays using Global Soft-State, Institute of Electrical and Electronics Engineers, Inc., May 2003
- Zheng Zhang, Shu-Ming Shi, and Jing Zhu, SOMO: Self-Organized Metadata Overlay for Resource Management in P2P DHT, Springer-Verlag, February 2003
- Zheng Zhang, Shu-Ming Shi, and Jing Zhu, Self-Balanced P2P Expressway: When Marxism Meets Confucian, no. MSR-TR-2002-72, July 2002
- Zheng Zhang, Xing Xie, Boying Lu, and Shiding Lin, Enabling rich content service on the edge: opportunities and challenges, no. MSR-TR-2002-71, July 2002
- Zheng Zhang and Qiao Lian, Reperasure: Replication Protocol using Erasure-code, no. MSR-TR-2002-59, June 2002



