Unraveling the "Anomaly" in Time Series Anomaly Detection: A Self-supervised Tri-domain Solution

Yuting Sun, Guansong Pang, Guanhua Ye, Tong Chen, Xia Hu, Hongzhi Yin
2023-11-27
Abstract: The ongoing challenges in time series anomaly detection (TSAD), notably the scarcity of anomaly labels and the variability in anomaly lengths and shapes, have led to the need for a more efficient solution. As limited anomaly labels hinder traditional supervised models in TSAD, various SOTA deep learning techniques, such as self-supervised learning, have been introduced to tackle this issue. However, they encounter difficulties handling variations in anomaly lengths and shapes, limiting their adaptability to diverse anomalies. Additionally, many benchmark datasets suffer from the problem of having explicit anomalies that even random functions can detect. This problem is exacerbated by an ill-posed evaluation protocol, known as point adjustment (PA), which can result in inflated model performance. In this context, we propose a novel self-supervised learning based Tri-domain Anomaly Detector (TriAD), which addresses these challenges by modeling features across three data domains (temporal, frequency, and residual) without relying on anomaly labels. Unlike traditional contrastive learning methods, TriAD employs both inter-domain and intra-domain contrastive loss to learn common attributes among normal data and differentiate them from anomalies. Additionally, our approach can detect anomalies of varying lengths by integrating with a discord discovery algorithm. It is worth noting that this study is the first to reevaluate the deep learning potential in TSAD, utilizing both rigorously designed datasets (i.e., UCR Archive) and evaluation metrics (i.e., PA%K and affiliation). Through experimental results on the UCR dataset, TriAD achieves an impressive three-fold increase in PA%K-based F1 scores over SOTA deep learning models, and a 50% increase in accuracy compared to SOTA discord discovery algorithms.
Machine Learning, Artificial Intelligence
### What problem does this paper attempt to address?

This paper targets several key challenges in time series anomaly detection (TSAD):

1. **Label scarcity**: Anomaly labels in TSAD are scarce or unavailable. Traditional supervised methods rely on large amounts of labeled data, which are often very difficult to obtain in practice.
2. **Variations in anomaly length and shape**: Anomalous events vary widely in length and shape, which makes it difficult for existing deep learning methods to adapt to different anomaly types. For example, point anomalies and continuous anomalies differ significantly in length and form.
3. **Problems with benchmark datasets and evaluation metrics**: Many commonly used benchmark datasets (such as Yahoo and NASA) suffer from label errors and unrealistic anomaly densities. In addition, the commonly used point adjustment (PA) evaluation protocol can overestimate model performance, leading to misleading results.
4. **Limitations of existing methods**: Existing contrastive learning methods for time series can rely on inappropriate augmentations (such as jittering and scaling), so that augmented normal data is misjudged as anomalous, which reduces detection accuracy.

To address these problems, the authors propose a novel self-supervised Tri-domain Anomaly Detector (TriAD), with the following main features:

- **Tri-domain feature extraction**: TriAD overcomes the limitations of single-domain feature representation by capturing features in the temporal, frequency, and residual domains.
- **Self-supervised learning framework**: The model does not rely on anomaly labels and is trained on normal data.
- **Inter-domain and intra-domain contrastive losses**: The two losses are combined so that the model learns the common attributes of normal data and can distinguish them from anomalous patterns.
- **Discord discovery algorithm integration**: TriAD is combined with a discord discovery algorithm to improve detection of anomalies of varying lengths.

With these design choices, TriAD is reported to surpass existing state-of-the-art methods in several respects, including higher F1 scores, faster inference, and better generalization. On the UCR dataset, the abstract reports a three-fold increase in PA%K-based F1 scores over SOTA deep learning models.

### Summary

The paper addresses label scarcity, anomaly diversity, and flawed benchmark datasets and evaluation metrics in TSAD by proposing the TriAD framework, which aims to provide a more robust and efficient solution for time series anomaly detection.
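The concern about point adjustment (PA) inflating scores, raised in the discussion of evaluation metrics above, can be illustrated with a small sketch. It is not taken from the paper: the function names (`segments`, `adjust`, `f1`) and the toy data are hypothetical, and the PA%K rule used here (adjust a true anomaly segment only when more than K percent of its points were flagged) is an assumption based on how the metric is commonly described.

```python
def segments(labels):
    """Return (start, end) pairs, end exclusive, for each run of 1s."""
    runs, start = [], None
    for i, v in enumerate(labels):
        if v and start is None:
            start = i
        elif not v and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(labels)))
    return runs


def adjust(pred, truth, k=0.0):
    """Assumed PA%K rule: mark a whole true segment as detected only if
    more than k percent of its points were flagged. k=0 is plain PA."""
    out = list(pred)
    for s, e in segments(truth):
        hit_ratio = 100.0 * sum(pred[s:e]) / (e - s)
        if hit_ratio > k:
            out[s:e] = [1] * (e - s)
    return out


def f1(pred, truth):
    """Point-wise F1 score for binary predictions."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)


# Toy data (invented): one 10-point anomaly in a 30-point series.
truth = [0] * 10 + [1] * 10 + [0] * 10
pred = [0] * 30
pred[12] = 1  # a single hit inside the anomaly
pred[3] = 1   # a single false alarm

print(round(f1(pred, truth), 3))                    # 0.167 (no adjustment)
print(round(f1(adjust(pred, truth, 0), truth), 3))  # 0.952 (plain PA)
print(round(f1(adjust(pred, truth, 20), truth), 3)) # 0.167 (PA%K, K=20)
```

In this toy case a detector that flags one point out of ten jumps from an F1 of about 0.17 to about 0.95 under plain PA, while the K=20 threshold leaves the score unchanged because only 10% of the segment was detected.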