Predicting Disk Failures with HMM- and HSMM-based Approaches

Ying Zhao, Xiang Liu, Siqing Gan, Weimin Zheng
DOI: https://doi.org/10.1007/978-3-642-14400-4_30
2010-01-01
Abstract: Understanding and predicting disk failures is essential for disk vendors, who want to manufacture more reliable drives, and for users, who want to build more reliable storage systems, so that service downtime and possible data loss can be avoided. Predicting disk failure from observable disk attributes, such as those provided by the Self-Monitoring, Analysis and Reporting Technology (SMART) system, has been shown to be effective. In this paper, we treat SMART data as time series and explore the predictive power of approaches based on hidden Markov models (HMMs) and hidden semi-Markov models (HSMMs). Our experimental results show that our prediction models outperform other models that do not capture the temporal relationship among attribute values over time. Using the best single attribute, our approach achieves a detection rate of 46% at a 0% false alarm rate. Combining the two best attributes, it achieves a detection rate of 52% at a 0% false alarm rate.
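As an illustration of how an HMM can score a SMART attribute as a time series, the sketch below compares the log-likelihood of an observation sequence under two discrete HMMs, one standing for good drives and one for failed drives, and flags the drive when the failed model explains the sequence better. This is a generic likelihood-comparison scheme, not necessarily the authors' exact setup: the two-state models, their parameter values, the 0/1 symbol encoding, and the names `forward_log_likelihood`, `predict_failure`, `GOOD_MODEL`, and `FAILED_MODEL` are all assumptions made for illustration. In practice the parameters would be trained from data rather than written by hand.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm)."""
    n_states = len(start)
    # Initialise with the start distribution and the first emission.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by the emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under this model
        # Rescale to avoid underflow and accumulate the log of the scale factor.
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik


def predict_failure(obs, good_model, failed_model, threshold=0.0):
    """Flag a drive when the failed-drive model fits better by more than threshold."""
    score = (forward_log_likelihood(obs, *failed_model)
             - forward_log_likelihood(obs, *good_model))
    return score > threshold


# Hypothetical hand-written models, each given as (start, trans, emit).
# Assumed symbols: 0 = attribute unchanged, 1 = attribute worsened since last sample.
GOOD_MODEL = (
    [0.9, 0.1],
    [[0.95, 0.05], [0.5, 0.5]],
    [[0.95, 0.05], [0.7, 0.3]],
)
FAILED_MODEL = (
    [0.6, 0.4],
    [[0.8, 0.2], [0.05, 0.95]],
    [[0.9, 0.1], [0.2, 0.8]],
)

if __name__ == "__main__":
    print(predict_failure([0, 0, 1, 1, 1, 1], GOOD_MODEL, FAILED_MODEL))
    print(predict_failure([0, 0, 0, 0, 0, 0], GOOD_MODEL, FAILED_MODEL))
```

Raising `threshold` makes the detector more conservative, which trades detection rate for a lower false alarm rate. A scheme of this kind uses the order of the observations, which is the temporal information that models treating each sample independently would ignore.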