Online Model-based Anomaly Detection in Multivariate Time Series: Taxonomy, Survey, Research Challenges and Future Directions

Lucas Correia, Jan-Christoph Goos, Philipp Klein, Thomas Bäck, Anna V. Kononova
DOI: https://doi.org/10.1016/j.engappai.2024.109323
2024-09-19
Abstract: Time-series anomaly detection plays an important role in engineering processes, like development, manufacturing and other operations involving dynamic systems. These processes can greatly benefit from advances in the field, as state-of-the-art approaches may aid in cases involving, for example, high-dimensional data. To provide the reader with an understanding of the terminology, this survey introduces a novel taxonomy where a distinction between online and offline, and training and inference is made. Additionally, it presents the most popular data sets and evaluation metrics used in the literature, as well as a detailed analysis. Furthermore, this survey provides an extensive overview of the state-of-the-art model-based online semi- and unsupervised anomaly detection approaches for multivariate time-series data, categorising them into different model families and other properties. The biggest research challenge revolves around benchmarking, as currently there is no reliable way to compare different approaches against one another. This problem is two-fold: on the one hand, public data sets suffer from at least one fundamental flaw, while on the other hand, there is a lack of intuitive and representative evaluation metrics in the field. Moreover, the way most publications choose a detection threshold disregards real-world conditions, which hinders the application in the real world. To allow for tangible advances in the field, these issues must be addressed in future work.
Machine Learning, Artificial Intelligence, Systems and Control
What problem does this paper attempt to address?
The paper attempts to address several key issues in online anomaly detection for multivariate time series:

1. **Benchmarking Issue**: Currently, there is no reliable method to compare different anomaly detection methods. This is mainly due to at least one fundamental flaw in public datasets and the lack of intuitive and representative evaluation metrics.
2. **Detection Threshold Selection Issue**: Most literature ignores practical application conditions when selecting detection thresholds, which hinders the application of these methods in the real world.
3. **Online Training and Inference Issue**: Existing research often does not clearly distinguish between online training and online inference, which is crucial for practical applications.

To promote substantial progress in this field, the paper proposes the following contributions:

- **New Taxonomy**: Clearly distinguishes between online training and online inference in the field of online anomaly detection.
- **Homogeneous Taxonomy for Continuous and Discrete Sequence Anomaly Detection Problems**.
- **Detailed Overview and Analysis of the Most Popular Benchmark Datasets**.
- **Detailed Overview and Analysis of the Proposed Evaluation Metrics**.
- **Updated Overview of the Latest Methods for Model-Based Online Anomaly Detection**.

Through these contributions, the paper aims to provide clearer and more practical guidance for researchers and practitioners to address the challenges in current online anomaly detection for multivariate time series.
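
To make the detection threshold selection issue more concrete, here is a minimal Python sketch of one possible reading of it. The assumption, which the summary above does not state, is that a deployable threshold has to be fixed before any labelled test data is seen, for example from anomaly scores computed on anomaly-free validation data. The function names `quantile_threshold` and `detect_online`, the quantile `q=0.99`, and the synthetic scores are all hypothetical and chosen purely for illustration. This is not the method proposed in the paper.

```python
import random


def quantile_threshold(scores, q=0.99):
    """Return an empirical quantile of the anomaly scores (nearest-rank style).

    Assumption: `scores` come from anomaly-free validation data, so no
    test labels are needed to fix the threshold.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    ordered = sorted(scores)
    # Index of the chosen quantile, clamped to the valid range.
    idx = min(len(ordered) - 1, max(0, int(q * len(ordered))))
    return ordered[idx]


def detect_online(score_stream, threshold):
    """Yield (score, is_anomaly) one time step at a time (online inference)."""
    for score in score_stream:
        yield score, score > threshold


# Synthetic stand-in for anomaly scores on nominal validation data.
rng = random.Random(0)
validation_scores = [abs(rng.gauss(0.0, 1.0)) for _ in range(1000)]

# The threshold is fixed once, before any test data is observed.
threshold = quantile_threshold(validation_scores, q=0.99)

# Scores arriving one by one at inference time (hypothetical values).
stream = [0.1, 0.5, 10.0, 0.2]
flags = [is_anomaly for _, is_anomaly in detect_online(stream, threshold)]
```

The design choice illustrated here is that the threshold depends only on data available before deployment. A threshold tuned for the best score on a labelled test set would not be obtainable under those conditions.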