Performance Issue Diagnosis for Online Service Systems

Qiang Fu, Jian-Guang Lou, Qing-Wei Lin, Rui Ding, Dongmei Zhang, Zihao Ye, Tao Xie
DOI: https://doi.org/10.1109/SRDS.2012.49
2012-01-01
Abstract: Monitoring and diagnosing performance issues in an online service system is critical to ensuring that the system performs satisfactorily. Given a detected performance issue and the system metrics collected for an online service system, engineers usually must spend considerable effort on diagnosis, starting by identifying performance issue beacons, which are metrics that pinpoint the root causes. To reduce this manual effort, we propose a new approach to effectively detecting performance issue beacons to help with performance issue diagnosis. Our approach includes techniques for mining system metric data that address limitations of applying previous classification-based approaches. Our evaluations in both a controlled environment and a real production environment show that our approach identifies performance issue beacons from system metric data more effectively than previous approaches.