Pengyu Li, Zhijie Zhong, Tong Zhang, Zhiwen Yu, C.L. Philip Chen, Kaixiang Yang
Abstract: Time series anomaly detection (TSAD) has been a research hotspot in both academia and industry in recent years. Deep learning methods have become the mainstream research direction due to their excellent performance. However, recent TSAD research has raised a different view: deep learning may not be required for TSAD, given limitations such as its slow speed. The Broad Learning System (BLS) is a shallow network framework that is easy to optimize and fast. It has been shown to outperform traditional machine learning approaches while remaining competitive with deep learning. Motivated by this, we propose the Contrastive Patch-based Broad Learning System (CPatchBLS), a new exploration of combining the patching technique with BLS that offers a new perspective for TSAD. We construct Dual-PatchBLS as a base through patching and Simple Kernel Perturbation (SKP), and use contrastive learning to capture the differences between normal and abnormal data under different representations. To compensate for the temporal semantic loss caused by different patching configurations, we propose CPatchBLS with model-level integration, which exploits the speed of BLS to combine multiple models and improve detection. On five real-world time series anomaly detection datasets, we confirm the method's efficacy: it outperforms previous deep learning and machine learning methods while retaining high computational efficiency.
### What problems does this paper attempt to solve?
This paper addresses several key problems in time-series anomaly detection (TSAD):
1. **Limitations of deep-learning methods**:
- Although deep-learning models perform well in TSAD, they train slowly and have complex model structures.
- Industrial applications urgently need fast algorithms, and deep-learning methods struggle to meet this need.
2. **Deficiencies of traditional machine-learning methods**:
- Traditional machine-learning methods such as LOF and Isolation Forest capture temporal information and anomaly semantics in time series poorly.
- These methods struggle with complex temporal data, and their performance degrades on high-dimensional data.
3. **Whether deep-learning architectures are necessary**:
- The research community has questioned whether deep-learning architectures are necessary for TSAD. Although deep learning performs well, its indispensability for TSAD has not been fully verified.
4. **Finding a more balanced solution**:
- The paper proposes combining the Broad Learning System (BLS) with patching to obtain a method that keeps the robustness and speed of machine learning while offering representation capabilities comparable to deep learning.
To solve these problems, the authors propose the **Contrastive Patch-based Broad Learning System (CPatchBLS)**, a method built on a shallow-network framework that aims to speed up training and testing while improving the accuracy of time-series anomaly detection. CPatchBLS strengthens the extraction and representation of temporal information through patching and contrastive learning, and thereby outperforms previous deep-learning and machine-learning methods on multiple real-world datasets.
### Formula summary
- **Hidden state calculation**:
\[
Z_i=\phi(xW_i + \beta_i),\quad\forall i\in\{1,2,\dots,m\}
\]
where \(\phi(\cdot)\) is the activation function (such as ReLU or Sigmoid), \(x\) is the input data, \(W_i\) is the weight matrix, and \(\beta_i\) is the bias matrix.
- **Cascaded feature layer**:
\[
Z^k_i =
\begin{cases}
\phi(xW^1_i+\beta^1_i), & \text{if } k = 1 \\
\phi(Z^{k - 1}_iW^k_i+\beta^k_i), & \text{if } k\in\{2,\dots,q\}
\end{cases}
\]
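The two feature-layer formulas above can be illustrated with a short NumPy sketch. It is only a minimal illustration, not the authors' implementation: the helper name `feature_group`, the layer widths, the ReLU activation, and the use of fixed Gaussian random weights (the usual choice for BLS feature nodes) are all assumptions made for this example.

```python
import numpy as np

def relu(v):
    """ReLU activation, one of the choices the text mentions for phi."""
    return np.maximum(v, 0.0)

def feature_group(x, widths, rng, phi=relu):
    """Compute one cascaded feature group Z_i^1 ... Z_i^q (hypothetical helper).

    x:      input of shape (n_samples, n_features)
    widths: output width of each cascade level k = 1..q
    Weights are drawn at random and left untrained (an assumption).
    """
    outputs = []
    z = x
    for width in widths:
        w = rng.standard_normal((z.shape[1], width))  # W_i^k
        beta = rng.standard_normal((1, width))        # beta_i^k, broadcast over rows
        z = phi(z @ w + beta)  # level 1 consumes x, level k consumes Z_i^{k-1}
        outputs.append(z)
    return outputs

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5))                  # 8 samples, 5 input features
levels = feature_group(x, widths=[4, 3], rng=rng)
shapes = [z.shape for z in levels]
```

With `widths=[4, 3]` the cascade has depth q = 2; passing a single width reduces it to the non-cascaded case \(Z_i=\phi(xW_i+\beta_i)\).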
- **Enhancement node**:
\[
H_j=\xi(ZW_j+\beta_j),\quad\forall j\in\{1,2,\dots,n\}
\]
- **Loss function**:
\[
L=\|Y - \hat{Y}\|^2_2+\lambda\|W_o\|^2_2
\]
- **Pseudo-inverse update of the output weights**:
\[
W_o=(A^{\top}A+\lambda I)^{-1}A^{\top}Y
\]
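As a rough sketch of how the enhancement nodes, the regularized loss, and the closed-form weight update fit together, the snippet below builds a design matrix and solves for \(W_o\). Several details are assumptions that the summary does not state: \(A\) is taken to be the feature and enhancement nodes stacked side by side (the usual BLS construction), \(\xi\) is taken to be `tanh`, the norms are treated as Frobenius norms, and all sizes and the value of \(\lambda\) are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_feat, n_enh, n_out = 50, 6, 4, 3

Z = rng.standard_normal((n_samples, n_feat))  # stand-in for the feature nodes
Y = rng.standard_normal((n_samples, n_out))   # stand-in training targets

# Enhancement nodes H = xi(Z W + beta) with fixed random weights (assumed xi = tanh).
W_h = rng.standard_normal((n_feat, n_enh))
beta_h = rng.standard_normal((1, n_enh))
H = np.tanh(Z @ W_h + beta_h)

# Assumption: the design matrix concatenates feature and enhancement nodes.
A = np.hstack([Z, H])

lam = 1e-2  # ridge coefficient lambda, arbitrary for this example
# Solve (A^T A + lam I) W_o = A^T Y without forming the explicit inverse.
W_o = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)

# Sanity check: the gradient of the loss, 2 A^T (A W_o - Y) + 2 lam W_o,
# should vanish at the closed-form solution.
grad = 2 * A.T @ (A @ W_o - Y) + 2 * lam * W_o
```

Using `np.linalg.solve` instead of an explicit inverse is a numerical-stability choice; it yields the same \(W_o\) as the formula above. Because only this linear solve is needed to fit the output layer, training involves no iterative gradient descent.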
- **Anomaly score**:
\[
\text{Score}=\|Y - \hat{Y}\|
\]
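The anomaly score above can be sketched as a reconstruction error. The summary does not say along which axis the norm is taken or how scores are thresholded, so the per-time-step Euclidean norm, the function name `anomaly_scores`, and the threshold value below are illustrative assumptions.

```python
import numpy as np

def anomaly_scores(y, y_hat):
    """Per-time-step score: Euclidean norm of the residual across channels (assumed)."""
    return np.linalg.norm(y - y_hat, axis=1)

# Toy data: 6 time steps, 2 channels; the prediction is off only at step 3.
y = np.zeros((6, 2))
y_hat = np.zeros((6, 2))
y_hat[3] = [3.0, 4.0]

scores = anomaly_scores(y, y_hat)
threshold = 1.0              # hypothetical threshold, chosen only for this example
flags = scores > threshold   # True where a time step is flagged as anomalous
```

Here the residual at step 3 is the vector (3, 4), so its score is 5 and it is the only flagged step.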
- **Difference score**:
\[
\text{Score}_{\text{diff}}=\frac{1}{2}\text{KL}(\text{PatchBLS SKP}(X),