ARF-Predictor: Effective Prediction of Aging-Related Failure Using Entropy
Pengfei Chen, Yong Qi, Xinyi Li, Di Hou, Michael Rung-Tsong Lyu
DOI: https://doi.org/10.1109/tdsc.2016.2604381
2016-01-01
IEEE Transactions on Dependable and Secure Computing
Abstract: Even well-designed software systems suffer from chronic performance degradation, also known as "software aging", due to internal (e.g., software bugs) or external (e.g., resource exhaustion) impairments. These chronic problems often fly under the radar of software monitoring systems before causing severe impacts (e.g., system failures). Predicting the occurrence of failures caused by these problems in a timely manner is therefore a challenging issue. Unfortunately, the effectiveness of prior approaches is far from satisfactory because the aging indicators they adopt are insufficient. To accurately predict failures caused by software aging, termed Aging-Related Failures (ARFs), this paper presents a novel entropy-based aging indicator, Multidimensional Multi-scale Entropy (MMSE), which leverages the complexity embedded in runtime performance metrics to indicate software aging. To the best of our knowledge, this is the first work to leverage entropy to predict ARFs. Based upon MMSE, we implement three failure prediction approaches encapsulated in a proof-of-concept prototype named ARF-Predictor. Experimental evaluations in a Video on Demand (VoD) system and in a real-world production system, AntVision, show that ARF-Predictor can predict ARFs with very high accuracy and a low Ahead-Time-To-Failure (ATTF). Compared to previous approaches, ARF-Predictor improves prediction accuracy by about 5 times and reduces ATTF by as much as 3 orders of magnitude. In addition, ARF-Predictor is lightweight enough to satisfy the real-time requirement.
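
The abstract does not define how MMSE is computed, so the following is only a rough illustration of the general idea of measuring complexity in a runtime metric at several time scales. It sketches classic single-metric multiscale sample entropy, not the paper's multidimensional MMSE. The function names (`coarse_grain`, `sample_entropy`, `multiscale_entropy`) and the parameter defaults (`m=2`, `r_factor=0.2`, `max_scale=3`) are assumptions chosen for illustration.

```python
import math
import statistics


def coarse_grain(series, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n = len(series) // scale
    return [sum(series[i * scale:(i + 1) * scale]) / scale for i in range(n)]


def sample_entropy(series, m, tol):
    """Sample entropy with template length m and absolute tolerance tol."""
    n = len(series)

    def count_matches(length):
        # Count template pairs (i < j) whose Chebyshev distance is <= tol.
        # Both lengths use the same n - m starting positions.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                dist = max(abs(series[i + k] - series[j + k]) for k in range(length))
                if dist <= tol:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined: too few matching templates
    return -math.log(a / b)


def multiscale_entropy(series, max_scale=3, m=2, r_factor=0.2):
    """Sample entropy of the coarse-grained series at scales 1..max_scale."""
    # Assumption: tolerance is fixed from the original series' standard deviation.
    tol = r_factor * statistics.pstdev(series)
    return [sample_entropy(coarse_grain(series, s), m, tol)
            for s in range(1, max_scale + 1)]
```

Under these assumptions, a hypothetical metric trace such as a list of memory-usage samples could be passed to `multiscale_entropy(samples)` to obtain one entropy value per scale. A perfectly regular trace yields an entropy of zero, while an irregular one yields larger values.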