Probabilistic Temporal Fusion Transformers for Large-Scale KPI Anomaly Detection

Haoran Luo, Yongkun Zheng, Kang Chen, Shuo Zhao
DOI: https://doi.org/10.1109/access.2024.3353201
IF: 3.9
2024-01-01
IEEE Access
Abstract: This paper introduces a generic, scalable framework for large-scale time series prediction and unsupervised anomaly detection. State-of-the-art time series anomaly detection techniques, which are mostly based on neural networks, typically train one network per time series. However, a typical modern microservice system consists of hundreds of active nodes or instances, and monitoring its performance often requires tracking thousands of time series that describe different aspects of the system, including CPU usage, call latency, and workload. We introduce a methodology that groups metrics of the same type and predicts hundreds of metrics concurrently with a single neural network model with shared parameters. The model also combines probabilistic representations with Temporal Fusion Transformers for better performance. On a real-world dataset, the proposed model achieved up to a 50% improvement in MSE.
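The grouping idea in the abstract can be illustrated with a minimal sketch. This is not the paper's model: a Gaussian AR(1) predictor stands in for the probabilistic Temporal Fusion Transformer, and the function names, the `(metric_type, node)` key layout, and the use of negative log-likelihood as the anomaly score are all assumptions made for illustration. The point it shows is that one set of parameters is fitted per metric type and then reused to score every series of that type.

```python
import math
from collections import defaultdict


def group_by_type(series):
    """Group series by metric type.

    `series` maps (metric_type, node) -> list of floats (hypothetical layout).
    """
    groups = defaultdict(dict)
    for (metric_type, node), values in series.items():
        groups[metric_type][node] = values
    return dict(groups)


def fit_shared_ar1(group):
    """Fit ONE Gaussian AR(1) model, x_t ~ N(a * x_{t-1} + b, sigma2),
    on the pooled transitions of every series in the group."""
    xs, ys = [], []
    for values in group.values():
        xs.extend(values[:-1])
        ys.extend(values[1:])
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov / var_x if var_x > 0 else 0.0
    b = mean_y - a * mean_x
    residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
    # Floor the variance so the log-likelihood stays finite.
    sigma2 = max(sum(r * r for r in residuals) / n, 1e-9)
    return a, b, sigma2


def anomaly_scores(values, params):
    """Score each step by its negative log-likelihood under the shared model.
    Higher scores mean the observation was less expected."""
    a, b, sigma2 = params
    scores = []
    for prev, cur in zip(values[:-1], values[1:]):
        mu = a * prev + b
        nll = 0.5 * math.log(2 * math.pi * sigma2) + (cur - mu) ** 2 / (2 * sigma2)
        scores.append(nll)
    return scores
```

In this sketch, adding a new node of an existing metric type requires no new model: its series is scored with the parameters already fitted for that type, which is the scaling property the abstract attributes to parameter sharing.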
computer science, information systems, telecommunications, engineering, electrical & electronic