Robust and Unsupervised KPI Anomaly Detection Based on Conditional Variational Autoencoder

Zeyan Li, Wenxiao Chen, Dan Pei
DOI: https://doi.org/10.1109/PCCC.2018.8710885
2018-01-01
Abstract: To ensure undisrupted web-based services, operators need to closely monitor various KPIs (Key Performance Indicators, such as CPU usage, network throughput, page views, and number of online users), detect anomalies in them, and trigger timely troubleshooting or mitigation. There can be hundreds of thousands or even millions of KPIs to monitor, so operators need automatic anomaly detection approaches. However, neither traditional statistical approaches nor supervised ensemble approaches satisfy this requirement in practice when facing a large number of KPIs. Donut, a state-of-the-art unsupervised approach, offers promising results, but it is not a sequential model and thus cannot deal with anomalies related to time information. In this paper we therefore propose Bagel, a robust and unsupervised anomaly detection algorithm for KPIs that can handle such time-related anomalies, using a conditional variational autoencoder (CVAE) to incorporate time information and a dropout layer to avoid overfitting. Our experiments using real data from Internet companies show that, compared to Donut, Bagel improves the best F1-score for anomaly detection by 0.08 to 0.43.