Detecting Application-Level Failures in Component-based Internet Services
- Emre Kiciman
- Armando Fox
IEEE Transactions on Neural Networks: Special Issue on Adaptive Learning Systems in Communication Networks
Most Internet services (e-commerce, search engines, etc.) suffer faults. Quickly detecting these faults can be the largest bottleneck in improving availability of the system. We present Pinpoint, a methodology for automatic fault detection in Internet services by (1) observing low-level, internal structural behaviors of the service; (2) modeling the majority behavior of the system as correct; and (3) detecting anomalies in these behaviors as possible symptoms of failures. Without requiring any a priori application-specific information, Pinpoint correctly detected 89-96% of major failures in our experiments, as compared to 20-70% detected by current application-generic techniques.
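The three steps in the abstract (observe internal behavior, treat the majority as correct, flag deviations) can be illustrated with a small sketch. This is not Pinpoint's actual algorithm, which the abstract does not specify. It assumes, purely for illustration, that each replica of a component records how often it calls each downstream peer, and it swaps in a simple median profile and total-variation distance as the anomaly test. All names, the sample data, and the 0.3 threshold are hypothetical.

```python
from statistics import median


def normalize(counts):
    """Turn raw interaction counts into a frequency distribution."""
    total = sum(counts.values())
    return {peer: n / total for peer, n in counts.items()} if total else {}


def majority_profile(profiles):
    """Per-peer median across all replicas, standing in for 'majority behavior'."""
    peers = set().union(*profiles.values())
    return {peer: median(p.get(peer, 0.0) for p in profiles.values())
            for peer in peers}


def deviation(profile, reference):
    """Total-variation distance between one replica and the reference profile."""
    peers = set(profile) | set(reference)
    return 0.5 * sum(abs(profile.get(peer, 0.0) - reference.get(peer, 0.0))
                     for peer in peers)


def detect_anomalies(observations, threshold=0.3):
    """Return the names of replicas whose behavior deviates from the majority.

    `observations` maps a replica name to {peer name: interaction count}.
    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    profiles = {name: normalize(counts) for name, counts in observations.items()}
    reference = majority_profile(profiles)
    return sorted(name for name, profile in profiles.items()
                  if deviation(profile, reference) > threshold)


# Hypothetical observations: four replicas of one component and their call counts.
observations = {
    "replica-1": {"db": 50, "cache": 50},
    "replica-2": {"db": 48, "cache": 52},
    "replica-3": {"db": 52, "cache": 48},
    "replica-4": {"db": 5, "cache": 95},  # rarely reaches the database
}
suspects = detect_anomalies(observations)
```

Because the reference profile is derived from the replicas themselves, the sketch needs no application-specific knowledge of what correct behavior looks like. It can only flag a replica as a possible symptom of failure, and it would miss a fault that affects most replicas at once.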
Copyright © 2007 IEEE. Reprinted from IEEE Computer Society. This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org. By choosing to view this document, you agree to all provisions of the copyright laws protecting it.