To detect, infer, diagnose, and recover from faults in enterprise wired and wireless networks.
NetHealth is a network management research program in which end-hosts cooperatively detect, diagnose, and recover from network faults. Unlike existing products, we take an end-host-centric approach to gathering, aggregating, and analyzing data at all layers of the networking stack to determine the root cause of problems. NetHealth includes several ongoing projects in the wireless and wired space.
Overview
Networks are being deployed extensively in large corporations, small offices, and homes. However, a significant number of "pain points" remain for end-users and network administrators. To resolve complaints quickly and efficiently, network administrators need tools that can assist them in detecting, isolating, diagnosing, and correcting faults. Furthermore, such tools should also detect network security breaches, possibly caused by innocent employees. The NetHealth project is about detecting, inferring, diagnosing, and recovering from user-perceived performance problems in enterprise networks.
Existing products do a reasonable job of presenting statistical data from the network. However, they do not do a comprehensive job of gathering and analyzing the data to establish the root cause of the problem. Furthermore, on the wireless side, most products gather data from the Access Points (APs) only and neglect the client-side view of the network. Some products that monitor the network from the client's perspective require hardware sensors, which can be expensive to deploy and maintain. Also, current solutions do not provide any support for disconnected clients even though these are the ones that need the most help. On the wired side, a number of researchers have come up with solutions for diagnosing problems over WANs; however, most of those approaches are not integrated to perform end-to-end inference and diagnostics.
Under the NetHealth umbrella, we are building algorithms and tools that:
- allow generalist operators to diagnose end-to-end performance as “seen” by users
- produce near real-time and historical-analysis reports of end-to-end performance problems with networked services and components
- prioritize and raise alerts based on an analysis of how performance glitches and problems affect users
- automatically resolve the problem or offer meaningful resolution strategies
- provide detailed analysis of wireless failures for mobile devices
- provide snapshots of the “health” of network elements and services
- complement existing detailed networked diagnosis technologies
In contrast to traditional network-based and bolt-on approaches, NetHealth leverages clients and servers. NetHealth agents on the end systems are positioned to harvest available application data and infer application-level dependencies directly, rather than reverse-engineering this information from the network or from summarized logs and alerts produced by computing and network elements and their associated management systems. As a result, the NetHealth approach is well suited to effective problem location and resolution, and to bringing together the intelligence needed to support meaningful resilience, self-healing, and other self-* capabilities.
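To make the idea of end-host dependency inference concrete, here is a minimal sketch. It is not the method used by any NetHealth sub-project; it assumes a hypothetical agent that records `(timestamp, host, service)` tuples and treats a service as a likely dependency when the same host usually contacted it shortly before the service of interest. The function name, the `window` and `min_ratio` parameters, and the sample data are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical sample of accesses logged by end-host agents:
# (timestamp in seconds, host name, service contacted).
SAMPLE_EVENTS = [
    (0.0, "h1", "dns"), (0.1, "h1", "web"),
    (5.0, "h1", "dns"), (5.2, "h1", "web"),
    (0.0, "h2", "dns"), (0.3, "h2", "web"),
    (9.0, "h2", "mail"),
]


def infer_dependencies(events, window=1.0, min_ratio=0.5):
    """Guess service dependencies from per-host access co-occurrence.

    For each access to a service, look at which other services the same
    host contacted in the preceding `window` seconds. A candidate is kept
    when it precedes at least `min_ratio` of all accesses to the service.
    Returns {service: {candidate_dependency: co-occurrence ratio}}.
    """
    # Group accesses by host so only same-host activity is correlated.
    by_host = defaultdict(list)
    for ts, host, service in events:
        by_host[host].append((ts, service))

    access_count = defaultdict(int)
    co_count = defaultdict(lambda: defaultdict(int))

    for accesses in by_host.values():
        accesses.sort()
        for i, (ts, service) in enumerate(accesses):
            access_count[service] += 1
            # Walk backwards through earlier accesses inside the window.
            preceding = set()
            j = i - 1
            while j >= 0 and ts - accesses[j][0] <= window:
                if accesses[j][1] != service:
                    preceding.add(accesses[j][1])
                j -= 1
            for dep in preceding:
                co_count[service][dep] += 1

    # Keep only candidates that co-occur often enough.
    result = {}
    for service, deps in co_count.items():
        kept = {
            dep: n / access_count[service]
            for dep, n in deps.items()
            if n / access_count[service] >= min_ratio
        }
        if kept:
            result[service] = kept
    return result
```

Calling `infer_dependencies(SAMPLE_EVENTS)` on the sample data reports that `web` is preceded by `dns` on every access, while `mail` has no candidate dependency. A real system would also need to separate true dependencies from coincidental traffic, which this sketch does not attempt.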
Sub-Projects
- Sherlock - Enterprise network management via analysis of network dependencies
- Orion - Dependency extraction in enterprise networks
- DAIR - Enterprise wireless LAN management via Dense Array of Inexpensive Radios
- ELDA (SureMail) - Notification system when email losses are detected
- NetProfiler - Cooperative Network Monitoring & Diagnosis
Brainstorming Events
All talks, videos, and presentation decks are available on each event's web site.
- Self Managing Networks Summit 2005 -- A 2-day mindswap event between researchers from industry, academia, and government to discuss self-aware networking. June 1-2, 2005
- EdgeNet 2006: Life at the Edge: Research and Practice in Corporate/Campus Networks -- This summit brought together experts in academia and industry to discuss the problems facing the designers and managers of enterprise networks. June 1-2, 2006
Publications
2010
- Sharad Agarwal, Nikitas Liogkas, Prashanth Mohan, and Venkat Padmanabhan, WebProfiler: cooperative diagnosis of Web failures, in COMSNETS, 5 January 2010
2009
- Lenin Ravindranath, Paramvir Bahl, Ranveer Chandra, David A. Maltz, Jitendra Padhye, and Parveen Patel, Change Is Hard: Adapting Dependency Graph Models For Unified Diagnosis in Wired/Wireless Networks, in Workshop: Research on Enterprise Networking, Association for Computing Machinery, Inc., 21 August 2009
2008
- Paramvir Bahl, Ranveer Chandra, Patrick P. C. Lee, Vishal Misra, Jitendra Padhye, Dan Rubenstein, and Yan Yu, Opportunistic Use of Client Repeaters to Improve Performance of WLANs, in ACM CoNEXT 2008 (Best Paper Award), Association for Computing Machinery, Inc., December 2008
- Victor Bahl, Ranveer Chandra, Patrick Lee, Vishal Misra, Jitendra Padhye, Dan Rubenstein, and Yan Yu, Opportunistic Use of Client Repeaters to Improve Performance of WLANs, no. MSR-TR-2008-149, October 2008
- Victor Bahl, Ranveer Chandra, Dave Maltz, Parveen Patel, Jitendra Padhye, and Lenin Ravindranath, Towards Unified Management of Networked Services in Wired and Wireless Networks, no. MSR-TR-2008-148, October 2008
- Srikanth Kandula, Ranveer Chandra, and Dina Katabi, What's Going On? Learning Communication Rules in Edge Networks, in ACM SIGCOMM, Association for Computing Machinery, Inc., August 2008
- Rohan Murty, Jitendra Padhye, Ranveer Chandra, Alec Wolman, and Brian Zill, Designing High Performance Enterprise Wi-Fi Networks, in Networked Systems Design & Implementation (NSDI), USENIX, April 2008
2007
- Paramvir Bahl, Ranveer Chandra, Albert Greenberg, Srikanth Kandula, David Maltz, and Ming Zhang, Towards Highly Reliable Enterprise Network Services via Inference of Multi-level Dependencies, in SIGCOMM, Association for Computing Machinery, Inc., August 2007
- Sharad Agarwal, Venkat Padmanabhan, and Dilip Joseph, Addressing Email Loss with SureMail: Measurement, Design, and Evaluation, in Usenix Annual Technical Conference, USENIX, 17 June 2007
- Ranveer Chandra, Jitendra Padhye, Alec Wolman, and Brian Zill, A Location-Based Management System for Enterprise Wireless LANs, no. MSR-TR-2007-16, February 2007
2006
- Venkat Padmanabhan, Sriram Ramabhadran, Sharad Agarwal, and Jitu Padhye, A Study of End-to-End Web Access Failures, in CoNEXT, ACM, 4 December 2006
- Paramvir Bahl, Paul Barham, Richard Black, Ranveer Chandra, Moises Goldszmidt, Rebecca Isaacs, Srikanth Kandula, Lun Li, John MacCormick, David A. Maltz, Richard Mortier, Mike Wawrzoniak, and Ming Zhang, Discovering Dependencies for Network Management, in Workshop on Hot Topics in Networks (HotNets-V), Association for Computing Machinery, Inc., Irvine, California, November 2006
- Ranveer Chandra, Venkat Padmanabhan, and Ming Zhang, WiFiProfiler: Cooperative Diagnosis in Wireless LANs, in Mobile Systems, Applications, and Services (MobiSys), Association for Computing Machinery, Inc., June 2006
- Paramvir Bahl, Ranveer Chandra, Jitendra Padhye, Alec Wolman, and Brian Zill, Enhancing the Security of Corporate Wi-Fi Networks Using DAIR, in ACM/USENIX Mobile Systems, Applications, and Services (MobiSys), Association for Computing Machinery, Inc., June 2006
- Sharad Agarwal, Dilip Joseph, and Venkata N. Padmanabhan, Addressing Email Loss with SureMail: Measurement, Design, and Evaluation, no. MSR-TR-2006-67, May 2006
2005
- Sharad Agarwal, Venkat Padmanabhan, and Dilip Joseph, SureMail: Notification Overlay for Email Reliability, in HotNets IV, Association for Computing Machinery, Inc., 14 November 2005
- DAIR: A Framework for Managing Enterprise Wireless Networks Using Desktop Infrastructure, in ACM HotNets-IV, Association for Computing Machinery, Inc., November 2005
2004
- Atul Adya, Paramvir Bahl, Ranveer Chandra, and Lili Qiu, Architecture and Techniques for Diagnosing Faults in IEEE 802.11 Infrastructure Networks, in ACM MobiCom, Association for Computing Machinery, Inc., September 2004
- Karthik Lakshminarayanan, Venkata N. Padmanabhan, and Jitendra Padhye, Bandwidth Estimation in Broadband Access Networks, no. MSR-TR-2004-44, May 2004
2003
- Lili Qiu, Paramvir Bahl, Ananth Rao, and Lidong Zhou, Troubleshooting Multihop Wireless Networks, no. MSR-TR-2004-11, December 2003
Press
- Larry Greenemeier, InformationWeek, Inside Microsoft's Labs, December 4, 2006
- Gary Anthes, Computerworld, The Future of E-mail, June 12, 2006
- Gary Anthes, Computerworld, Projects in the Microsoft Research labs, June 5, 2006
- Joris Evers, The Industry Standard, Microsoft Researchers target worms, March 4, 2005
