CTF: Anomaly Detection in High-Dimensional Time Series with Coarse-to-Fine Model Transfer

Ming Sun, Ya Su, Shenglin Zhang, Yuanpu Cao, Yuqing Liu, Dan Pei, Wenfei Wu, Yongsu Zhang, Xiaozhou Liu, Junliang Tang
DOI: https://doi.org/10.1109/infocom42981.2021.9488755
2021-01-01
Abstract: Anomaly detection is indispensable in modern IT infrastructure management. However, the dimension explosion of monitoring data (large numbers of machines, many key performance indicators, and frequent monitoring queries) creates a scalability problem for existing algorithms. We propose CTF, a framework based on coarse-to-fine model transfer, to achieve scalable and accurate anomaly detection at data-center scale. CTF pre-trains a coarse-grained model, uses that model to extract per-machine features and compress them into a distribution, clusters machines according to these distributions, and then transfers the model by fine-tuning a per-cluster model for high accuracy. The framework improves overall training efficiency by clustering on the per-machine latent representation distribution, reusing the pre-trained model, and fine-tuning only part of the layers. We also justify design choices, such as the clustering algorithm and the distance measure, to achieve the best accuracy. We prototype CTF and evaluate it on production data to demonstrate its scalability and accuracy. We also release a labeling tool for multivariate time series and a labeled dataset to the research community.
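The middle steps of the pipeline described in the abstract (compressing per-machine features into a distribution, then clustering machines by that distribution) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: each machine's latent vectors are summarized as a diagonal Gaussian, the distance is the 2-Wasserstein distance between those Gaussians, the clustering is a simple threshold-based single-linkage agglomeration, and the names `summarize`, `wasserstein2`, and `cluster_machines` are hypothetical. The pre-training and per-cluster fine-tuning steps are omitted.

```python
import math


def summarize(latents):
    """Compress one machine's latent vectors into per-dimension (means, variances).

    Assumption: a diagonal Gaussian stands in for the "distribution"
    mentioned in the abstract.
    """
    n = len(latents)
    dims = len(latents[0])
    means = [sum(z[d] for z in latents) / n for d in range(dims)]
    variances = [
        sum((z[d] - means[d]) ** 2 for z in latents) / n for d in range(dims)
    ]
    return means, variances


def wasserstein2(a, b):
    """2-Wasserstein distance between two diagonal Gaussians (assumed distance)."""
    (m1, v1), (m2, v2) = a, b
    sq = sum((x - y) ** 2 for x, y in zip(m1, m2))
    sq += sum((math.sqrt(x) - math.sqrt(y)) ** 2 for x, y in zip(v1, v2))
    return math.sqrt(sq)


def cluster_machines(summaries, threshold):
    """Single-linkage agglomerative clustering (assumed algorithm).

    Repeatedly merges the two closest clusters until the closest pair
    is farther apart than `threshold`.
    """
    clusters = [[i] for i in range(len(summaries))]

    def linkage(c1, c2):
        return min(wasserstein2(summaries[i], summaries[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy latent vectors for four hypothetical machines (made-up numbers).
demo_latents = [
    [[0.0, 0.0], [0.2, 0.1], [0.1, -0.1]],
    [[0.1, 0.0], [0.0, 0.2], [0.2, 0.1]],
    [[5.0, 5.0], [5.2, 4.9], [4.9, 5.1]],
    [[5.1, 5.0], [4.9, 5.2], [5.0, 4.8]],
]
demo_summaries = [summarize(z) for z in demo_latents]
demo_clusters = cluster_machines(demo_summaries, threshold=1.0)
```

In a full pipeline along these lines, each resulting cluster would receive its own copy of the pre-trained model, with only some layers fine-tuned on that cluster's data.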