Healing Online Service Systems Via Mining Historical Issue Repositories

Rui Ding, Qiang Fu, Jian-Guang Lou, Qingwei Lin, Dongmei Zhang, Jiajun Shen, Tao Xie
DOI: https://doi.org/10.1109/dsn.2014.39
2012-01-01
Abstract: Online service systems have become increasingly popular and important. Reducing the MTTR (Mean Time to Restore) of a service remains one of the most important steps in assuring its user-perceived availability. To reduce the MTTR, a common practice is to restore the service by identifying and applying an appropriate healing action. In this paper, we present an automated mining-based approach for suggesting an appropriate healing action for a given new issue. Our approach retrieves similar historical issues and adapts their healing actions to the new issue. We have applied our approach to a real-world, large-scale product online service. Studies on 243 real issues of the service show that our approach can effectively suggest appropriate healing actions (with 87% accuracy) to reduce the MTTR of the service. In addition, we study and categorize, according to issue characteristics, the issues for which automatic healing suggestion faces difficulties.
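
The retrieve-and-reuse idea in the abstract can be illustrated with a minimal sketch. This is not the paper's method: it assumes each issue is reduced to a set of symptom terms, swaps in plain Jaccard similarity for the retrieval step, and reuses the best match's action unchanged instead of adapting it. The names Issue, jaccard, suggest_healing_action, the min_similarity threshold, and the sample symptoms and actions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """Hypothetical record: symptom terms plus the action that resolved the issue."""
    symptoms: frozenset
    healing_action: str


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_healing_action(new_symptoms, history, min_similarity=0.3):
    """Return (action, similarity) of the most similar past issue.

    Returns None when the history is empty or the best match scores
    below min_similarity (an assumed cutoff, not taken from the paper).
    """
    new_symptoms = frozenset(new_symptoms)
    best = max(
        history,
        key=lambda issue: jaccard(new_symptoms, issue.symptoms),
        default=None,
    )
    if best is None:
        return None
    score = jaccard(new_symptoms, best.symptoms)
    if score < min_similarity:
        return None
    return best.healing_action, score


# Made-up historical issues, for illustration only.
history = [
    Issue(frozenset({"timeout", "db", "connection_pool"}), "restart database service"),
    Issue(frozenset({"disk_full", "log", "write_error"}), "clean up log files"),
]

suggestion = suggest_healing_action({"timeout", "db", "latency"}, history)
print(suggestion)
```

Returning None below the threshold reflects the abstract's remark that automatic suggestion faces difficulties for some issues: when no historical issue is close enough, it may be safer to suggest nothing than to reuse an unrelated action.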