Abstract: In the contemporary digital landscape, the continuous generation of extensive streaming data across diverse domains has become pervasive. Yet a significant portion of this data remains unlabeled, posing a challenge in identifying infrequent events such as anomalies. This challenge is further amplified in non-stationary environments, where the performance of models can degrade over time due to concept drift. To address these challenges, this paper introduces a new method referred to as VAE4AS (Variational Autoencoder for Anomalous Sequences). VAE4AS integrates incremental learning with dual drift detection mechanisms, employing both a statistical test and a distance-based test. Anomaly detection is performed by a Variational Autoencoder. To gauge the effectiveness of VAE4AS, a comprehensive experimental study is conducted using real-world and synthetic datasets characterized by anomaly rates below 10% and recurrent drift. The results show that the proposed method surpasses both robust baselines and state-of-the-art techniques, providing compelling evidence for its efficacy in addressing some of the challenges associated with anomalous sequence detection in non-stationary streaming data.

What problem does this paper attempt to address?

The main problem that this paper attempts to solve is the detection of anomalous sequences in non-stationary environments. Specifically, the researchers face the following challenges:

1. **A large amount of unlabeled streaming data**: Continuously generated streaming data is common across many fields, but a large part of it is unlabeled, which makes it difficult to identify rare events such as anomalies.
2. **Concept drift**: In non-stationary environments, the performance of a model may decline over time because the data distribution has changed. This phenomenon is called concept drift, and it causes the model to gradually become ineffective.
3. **Distinguishing between anomalous sequences and concept drift**: Even in a supervised setting, correctly distinguishing anomalous sequences from concept drift remains a key research challenge. In this paper, the authors address this problem in an unsupervised setting.

To solve these problems, the paper proposes a new method, VAE4AS (Variational Autoencoder for Anomalous Sequences), which combines incremental learning with a dual concept drift detection mechanism. Specifically, VAE4AS has the following characteristics:

- **Variational Autoencoder (VAE)**: Used for anomaly detection.
- **Incremental learning**: Able to adapt to changes in the data over time without retraining the entire model.
- **Dual concept drift detection mechanism**:
  - **Statistical test**: Based on the Kolmogorov-Smirnov (KS) test, used to detect changes in the distribution of the latent layer.
  - **Distance test**: Based on the Euclidean distance, used to compare reference anomalous instances with newly classified anomalous instances.

Through these techniques, VAE4AS can detect anomalous sequences in non-stationary environments without relying on labeled data.
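The dual drift check described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the thresholds `alpha` and `dist_limit`, the use of centroids for the distance test, and the choice to report the two tests separately are all hypothetical. The p-value uses the standard textbook asymptotic approximation for the two-sample KS test, which may differ in detail from the scaling used in the paper.

```python
import math
from bisect import bisect_right


def ks_statistic(ref, mov):
    """Largest gap between the empirical CDFs of two 1-D samples."""
    ref_sorted, mov_sorted = sorted(ref), sorted(mov)
    gap = 0.0
    for v in ref_sorted + mov_sorted:
        f_ref = bisect_right(ref_sorted, v) / len(ref_sorted)
        f_mov = bisect_right(mov_sorted, v) / len(mov_sorted)
        gap = max(gap, abs(f_ref - f_mov))
    return gap


def ks_p_value(d, n_ref, n_mov, terms=100):
    """Asymptotic two-sample p-value from the Kolmogorov series."""
    n_eff = n_ref * n_mov / (n_ref + n_mov)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.3:
        # The truncated alternating series is unreliable here; p is ~1.
        return 1.0
    total = sum((-1) ** (i - 1) * math.exp(-2.0 * i * i * lam * lam)
                for i in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def dual_drift_check(ref_latent, mov_latent, ref_anoms, mov_anoms,
                     alpha=0.05, dist_limit=1.0):
    """Run both tests; alpha and dist_limit are hypothetical thresholds."""
    d = ks_statistic(ref_latent, mov_latent)
    p = ks_p_value(d, len(ref_latent), len(mov_latent))
    # Assumption: compare anomaly sets through their centroids.
    dist = math.dist(centroid(ref_anoms), centroid(mov_anoms))
    return {"p_value": p, "ks_drift": p < alpha,
            "distance": dist, "distance_drift": dist > dist_limit}


# Example: a shifted latent window and a displaced anomaly cluster.
reference = [i / 100 for i in range(100)]
shifted = [x + 2.0 for x in reference]
result = dual_drift_check(reference, shifted,
                          [[0.0, 0.0], [2.0, 0.0]],
                          [[4.0, 3.0], [6.0, 3.0]])
```

In this example both tests fire: the two latent windows do not overlap at all, and the anomaly centroids are 5 units apart. How the two signals are combined into a single adaptation decision is left open here, since the summary does not specify it.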
Experimental results show that this method outperforms existing baselines and state-of-the-art methods on both real-world and synthetic datasets.

### Formula Summary

1. **KL Divergence**:
   \[ l_{\text{KL}}(x)=\text{KL}(q(z|x)\|N(0, I_k)) = \frac{1}{2}\sum_{i = 1}^k\left(\mu_i^2+\sigma_i^2-\log(\sigma_i^2)-1\right) \]
2. **Total Loss Function**:
   \[ l_{\text{VAE}}(x,\hat{x})=l_{\text{AE}}(x,\hat{x})+\beta\cdot l_{\text{KL}}(x) \]
3. **Anomaly Threshold**:
   \[ \theta_t=\text{mean}(L_t)+2\cdot\text{std}(L_t) \]
4. **Calculation of p-value in KS Test**:
   \[ p\text{-value} = 2\sum_{i = 1}^{\infty}(-1)^{i - 1}e^{-2i^2\gamma^2} \]
   where
   \[ \gamma=\sqrt{\frac{N_{\text{eff}}+0.12}{0.11N_{\text{eff}}}} \]
   \[ KS_{\text{dis}}=\max|F(\text{reflatent}_i)-F(\text{mov latent}_i)| \]
   \[ N_{\text{eff}}=\frac{W_{\text{drift}}^2}{2W_{\text{drift}}} \]
5. **Euclidean Distance**:
   \[ DIS(\text{refdisx},\text{mov AN})=\sqrt{\cdots} \]
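The KL divergence, total loss, and anomaly threshold formulas above translate directly into a few lines of Python. The sketch below makes several assumptions that the summary does not confirm: mean squared error stands in for the reconstruction term `l_AE`, the standard deviation is the population form, and an instance is flagged when its loss exceeds the threshold. All function names are hypothetical.

```python
import math
import statistics


def kl_term(mu, sigma):
    """KL(q(z|x) || N(0, I_k)) for a diagonal Gaussian posterior."""
    return 0.5 * sum(m * m + s * s - math.log(s * s) - 1.0
                     for m, s in zip(mu, sigma))


def reconstruction_error(x, x_hat):
    """Mean squared error; an assumed stand-in for l_AE."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def vae_loss(x, x_hat, mu, sigma, beta=1.0):
    """Total loss: reconstruction term plus beta-weighted KL term."""
    return reconstruction_error(x, x_hat) + beta * kl_term(mu, sigma)


def anomaly_threshold(losses):
    """theta_t = mean(L_t) + 2 * std(L_t), using the population std."""
    return statistics.mean(losses) + 2.0 * statistics.pstdev(losses)


# Example: losses from a recent window, then a new instance to score.
window_losses = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
theta = anomaly_threshold(window_losses)  # mean 5.0, std 2.0
new_loss = 12.0
is_anomalous = new_loss > theta  # assumed decision rule
```

A posterior that matches the prior exactly (`mu = 0`, `sigma = 1` in every dimension) contributes zero KL penalty, so in that case the total loss reduces to the reconstruction error alone.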

Unsupervised Incremental Learning with Dual Concept Drift Detection for Identifying Anomalous Sequences

Autoencoder-based Anomaly Detection in Streaming Data with Incremental Learning and Concept Drift Adaptation

Concept Drift Detection and Adaptation with Weak Supervision on Streaming Unlabeled Data

ADTCD: An Adaptive Anomaly Detection Approach Toward Concept Drift in IoT

Online Data Drift Detection for Anomaly Detection Services based on Deep Learning towards Multivariate Time Series

Unsupervised anomaly video detection via a double-flow ConvLSTM variational autoencoder

TeVAE: A Variational Autoencoder Approach for Discrete Online Anomaly Detection in Variable-state Multivariate Time-series Data

An ensemble learning method with GAN-based sampling and consistency check for anomaly detection of imbalanced data streams with concept drift

Unsupervised Unlearning of Concept Drift with Autoencoders

Anomaly Detection of Time Series With Smoothness-Inducing Sequential Variational Auto-Encoder

Drift-aware Anomaly Detection for Non-stationary Time Series

Revisiting VAE for Unsupervised Time Series Anomaly Detection: A Frequency Perspective

A novel framework for concept drift detection using autoencoders for classification problems in data streams

Online Semi-Supervised Concept Drift Detection with Density Estimation

Deep evolving semi-supervised anomaly detection

On the Reliable Detection of Concept Drift from Streaming Unlabeled Data

Residual spatiotemporal autoencoder for unsupervised video anomaly detection

Anomaly Detection for Time Series Using VAE-LSTM Hybrid Model

Fast Unsupervised Online Drift Detection Using Incremental Kolmogorov-Smirnov Test

Video anomaly detection based on a multi-layer reconstruction autoencoder with a variance attention strategy

VLAD: Task-agnostic VAE-based lifelong anomaly detection