MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications

Yuuki Tsubouchi, Hirofumi Tsuruta
DOI: https://doi.org/10.1109/access.2024.3374334
IF: 3.9
2024-03-15
IEEE Access
Abstract: Automated fault localization in large-scale cloud-based applications is challenging because it involves mining multivariate time series data from large volumes of operational monitoring metrics. To improve localization accuracy, automated fault localization methods incorporate feature reduction to reduce the number of monitoring metrics unrelated to a failure. However, these methods have problems with inaccuracy, either from removing too many failure-related metrics or from retaining too many failure-unrelated metrics. In this paper, we present MetricSifter, a feature reduction framework designed to accurately identify anomalous metrics caused by faults. Our framework locates a failure time window with the highest density of fault-induced change point times across monitoring metrics with a focus on their temporal proximity. Experimental results indicate that MetricSifter achieves an accuracy of 0.981, which is significantly better than the selected baseline methods. Furthermore, experiments combining various reduction methods with various localization methods demonstrate that MetricSifter improves the recall and time efficiency over the baseline methods.
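The window-location idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes change point times have already been detected for each metric, and it substitutes a simple fixed-width sliding window for whatever density estimate the paper uses. The function names `densest_window` and `sift_metrics`, the `width` parameter, and the sample data are all hypothetical.

```python
from typing import Dict, List, Tuple


def densest_window(
    change_points: Dict[str, List[float]], width: float
) -> Tuple[float, float]:
    """Return the [start, start + width] window covering the most change points.

    Assumption: a fixed-width window stands in for a proper density estimate.
    """
    times = sorted(t for cps in change_points.values() for t in cps)
    if not times:
        raise ValueError("no change points given")
    best_start, best_count = times[0], 0
    right = 0
    # Two-pointer sweep: each change point time is tried as a window start,
    # and `right` advances past the last time that still fits in the window.
    for left, start in enumerate(times):
        while right < len(times) and times[right] <= start + width:
            right += 1
        count = right - left
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_start + width


def sift_metrics(change_points: Dict[str, List[float]], width: float) -> List[str]:
    """Keep only metrics with at least one change point inside the densest window."""
    lo, hi = densest_window(change_points, width)
    return [
        name
        for name, cps in change_points.items()
        if any(lo <= t <= hi for t in cps)
    ]


# Hypothetical change point times (seconds) per monitoring metric.
example = {
    "cpu": [100.0],
    "latency": [102.0, 300.0],
    "errors": [104.0],
    "disk": [250.0],
    "idle": [],
}
kept = sift_metrics(example, width=10.0)
```

Here `kept` holds the three metrics whose change points cluster around t = 100 s, while `disk` and `idle` are dropped as unrelated to the presumed failure window.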
computer science, information systems; telecommunications; engineering, electrical & electronic