Agent-based dynamic thresholding for adaptive anomaly detection using reinforcement learning
Xue Yang, Enda Howley, Michael Schukat
DOI: https://doi.org/10.1007/s00521-024-10536-0
2024-12-06
Neural Computing and Applications
Abstract: The complexity and scale of IT systems are increasing dramatically, posing many challenges to real-world anomaly detection. Over the years, deep learning-based methods focusing on feature learning and anomaly scoring have been studied extensively and have achieved tremendous success in this area. However, little work has been done on the thresholding problem, despite it being a critical factor for detecting anomalies effectively. The commonly used static or expert-defined thresholds lack adaptability to non-stationary and evolving time series. In this paper, we model thresholding in anomaly detection as a Markov decision process and propose an agent-based dynamic thresholding (ADT) framework based on a deep Q-network. First, an anomaly scorer such as an autoencoder is employed to obtain feature representations and produce anomaly scores for complex input data. Afterward, by analyzing anomaly scores and other useful environmental information, ADT automatically provides appropriate binary thresholds, thereby achieving self-adaptive anomaly detection. Additionally, we introduce a rigorous mathematical approach to convert the binary thresholds into more fine-grained continuous thresholds that can adapt to different user requirements and practical situations. The properties of ADT are studied through comprehensive experiments on three real-world datasets and compared with baseline methods, demonstrating its thresholding capability, data-efficient learning, stability, and robustness, which lead to significantly improved detection performance. Our research underscores the transformative role of reinforcement learning (RL) in providing adaptive anomaly detection, achieving remarkable results with minimal labels for training, even in scenarios where labels are partially observable or contaminated with noise.
To the best of our knowledge, we are the first to exhibit the application of RL for optimal thresholding control, for both binary and continuous thresholding scenarios, within the domain of time series anomaly detection.
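As a rough illustration of treating thresholding as a sequential decision problem, the following Python sketch trains an agent that picks one of two thresholds for each anomaly score. It is not the authors' implementation: it swaps the deep Q-network for plain tabular Q-learning, and the threshold pair `THRESHOLDS`, the score discretization in `to_state`, the +1/-1 reward, and every function name are assumptions made for this example only. Scores are assumed to be normalized to [0, 1].

```python
import random

# Assumed for illustration: two candidate thresholds for scores in [0, 1].
THRESHOLDS = (0.0, 1.0)
N_BINS = 10  # assumed state space: the anomaly score discretized into bins


def to_state(score):
    """Map a score in [0, 1] to a discrete state index."""
    return min(int(score * N_BINS), N_BINS - 1)


def detect(score, threshold):
    """Flag a point as anomalous when its score exceeds the threshold."""
    return score > threshold


def reward(flagged, label):
    """Assumed reward: +1 for a correct decision, -1 otherwise."""
    return 1.0 if flagged == bool(label) else -1.0


def train(scores, labels, episodes=200, alpha=0.1, gamma=0.9,
          epsilon=0.1, seed=0):
    """Tabular Q-learning over the score sequence (a stand-in for a DQN)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_BINS)]
    for _ in range(episodes):
        for t in range(len(scores) - 1):
            s = to_state(scores[t])
            # Epsilon-greedy choice between the two thresholds.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda i: q[s][i])
            r = reward(detect(scores[t], THRESHOLDS[a]), labels[t])
            s_next = to_state(scores[t + 1])
            # Standard one-step Q-learning update.
            q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
    return q


def greedy_threshold(q, score):
    """Return the threshold the learned policy selects for this score."""
    s = to_state(score)
    return THRESHOLDS[max((0, 1), key=lambda i: q[s][i])]


if __name__ == "__main__":
    # Synthetic toy data, not taken from the paper.
    toy_scores = [0.05, 0.1, 0.95, 0.15, 0.9, 0.2, 0.85, 0.1]
    toy_labels = [0, 0, 1, 0, 1, 0, 1, 0]
    q_table = train(toy_scores, toy_labels)
    for sc in (0.1, 0.9):
        th = greedy_threshold(q_table, sc)
        print(sc, th, detect(sc, th))
```

In this toy setup the state is only the current score, so the learned policy reduces to a per-bin decision. The abstract indicates that ADT also uses other environmental information, which this sketch does not attempt to model.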
computer science, artificial intelligence