Qiang Fu, Jian-Guang Lou, Qingwei Lin, Rui Ding, Dongmei Zhang, Zihao Ye, and Tao Xie
Monitoring and diagnosing performance issues of an online service system are critical to ensuring satisfactory performance of the system. Given a detected performance issue and collected system metrics for an online service system, engineers usually need to expend great effort on diagnosis, first identifying performance issue beacons, which are metrics that point to the root causes. To reduce this manual effort, we propose a new approach to effectively detecting performance issue beacons to help with performance issue diagnosis. Our approach includes techniques for mining system metric data to address limitations that arise when applying previous classification-based approaches. Our evaluations in both a controlled environment and a real production environment show that our approach identifies performance issue beacons from system metric data more effectively than previous approaches.
In the 31st International Symposium on Reliable Distributed Systems (SRDS’12)