AutoKAD: Empowering KPI Anomaly Detection with Label-Free Deployment

Zhaoyang Yu, Changhua Pei, Shenglin Zhang, Xidao Wen, Jianhui Li, Gaogang Xie, Dan Pei
DOI: https://doi.org/10.1109/issre59848.2023.00063
2023-01-01
Abstract: Monitoring Key Performance Indicators (KPIs) and detecting anomalies in online service systems is critical. However, choosing the right KPI anomaly detection algorithm and appropriate hyperparameters is challenging. Conventional Automated Machine Learning (AutoML) struggles to address this because the hold-out dataset lacks labels and its loss doesn’t reliably reflect anomaly detection accuracy. To address these challenges, this paper introduces AutoKAD, an AutoML framework designed to solve the combined algorithm selection and hyperparameter optimization problem for unsupervised KPI Anomaly Detection. We propose a label-free universal objective function, inspired by the Local Outlier Factor (LOF), for evaluating AutoML trials. Additionally, we improve the acquisition function and design a cluster-based warm start strategy to enhance exploration effectiveness and efficiency. The experimental results on three real-world datasets show that our approach outperforms the state-of-the-art (SOTA) model selection algorithm by 11% in F1-score and achieves performance comparable (99%) to the theoretically optimal results. We believe that AutoKAD can greatly improve the deployment feasibility of existing anomaly detection algorithms in real-world systems. Our code is anonymously released at https://github.com/NetManAIOps/AutoKAD.
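
The abstract names the Local Outlier Factor (LOF) as the inspiration for the label-free objective but does not specify the objective itself. As background only, the following is a minimal sketch of the standard LOF computation in plain Python. It is not AutoKAD's objective function; the function name `local_outlier_factor`, the parameter `k`, the small epsilon guard, and the toy data are all assumptions made for illustration.

```python
import math


def local_outlier_factor(points, k=2):
    """Return the standard LOF score of every point (illustrative sketch).

    points: list of equal-length tuples of floats.
    k: neighbourhood size (hypothetical default, chosen for the toy data).
    Scores near 1 indicate density similar to the neighbours; scores well
    above 1 indicate a point that is sparser than its neighbours.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    k_dist = []      # distance from each point to its k-th nearest neighbour
    neighbours = []  # (distance, index) pairs within the k-distance

    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        kd = dists[k - 1][0]
        k_dist.append(kd)
        # Keep ties at the k-distance, as in the usual LOF definition.
        neighbours.append([(d, j) for d, j in dists if d <= kd])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbours[i]]
        mean_reach = sum(reach) / len(reach)
        # Epsilon guards against division by zero for duplicate points.
        lrd.append(1.0 / (mean_reach + 1e-12))

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for _, j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]


# Toy 1-D data: four tightly spaced values and one isolated value.
points = [(0.0,), (0.1,), (0.2,), (0.3,), (5.0,)]
scores = local_outlier_factor(points, k=2)
print(scores)
```

On this toy input the isolated point at 5.0 receives by far the largest score, while the four clustered points score close to 1. No labels are involved at any step, which is the property that makes LOF-style measures a plausible starting point for a label-free evaluation signal.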