An Unsupervised Long- and Short-term Sparse Graph Neural Network for Multi-sensor Anomaly Detection

Qiucheng Miao, Dandan Wang, Chuanfu Xu, Jun Zhan, Chengkun Wu
DOI: https://doi.org/10.1109/jsen.2024.3383665
IF: 4.3
2024-01-01
IEEE Sensors Journal
Abstract: Anomaly detection of multivariate time series is critical in many applications. However, traditional statistical and machine learning models are limited in their ability to model complex temporal dependencies and inter-sensor correlations. To address these limitations, graph neural networks (GNNs) have emerged as a powerful paradigm and have shown promising progress in anomaly detection. However, most existing GNN-based methods simplify sensor associations as fully connected graphs, which contradicts the sparse connectivity found in real-world systems. Moreover, while capturing inter-sensor dependencies, GNNs often overlook critical temporal dependencies in time series. To address these challenges, we propose an unsupervised long- and short-term sparse graph attention (LSGA) neural network. Specifically, we first use convolutional neural networks (CNNs) and skip-gated recurrent units (skip-GRUs) to extract local dependencies and long-term trends. With its time-skip connections, the skip-GRU extends the span of information flow compared to a traditional GRU. Because the graph structure between sensors is unknown, we use node embeddings to calculate the similarity between sensors and generate a dense similarity matrix. We then use Gumbel-softmax sampling to transform the similarity matrix into a sparse graph structure. To fuse information from different sensors, we introduce a graph attention network (GAT), which learns the relationships between sensors and dynamically fuses information based on the similarity of their node embedding vectors. Through this sparse representation, each sensor selectively fuses information from the sensors that influence it most, filtering out connections with low similarity between nodes and removing redundant association information. Finally, extensive experiments show that the proposed method outperforms several state-of-the-art baselines on all four real-world datasets, improving the average F1 score by 0.97%, 7.7%, 1.92%, and 1.8%, respectively.
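The graph-construction step described in the abstract (node embeddings, then a dense similarity matrix, then Gumbel-softmax sparsification) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact formulation, so the cosine similarity, the mapping of similarity to a keep probability, the temperature `tau`, and the function names `cosine_similarity_matrix` and `gumbel_softmax_edges` are all assumptions made for illustration.

```python
import numpy as np


def cosine_similarity_matrix(emb):
    """Dense sensor-to-sensor similarity from node embeddings (assumed cosine)."""
    norm = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norm, 1e-12, None)
    return unit @ unit.T


def gumbel_softmax_edges(sim, tau=0.5, rng=None):
    """Sample a sparse directed adjacency from a dense similarity matrix.

    Each candidate edge gets a two-class (keep, drop) Gumbel-softmax sample.
    Mapping similarity in [-1, 1] to a keep probability is an assumption.
    """
    rng = np.random.default_rng() if rng is None else rng
    p_keep = np.clip((sim + 1.0) / 2.0, 1e-6, 1.0 - 1e-6)
    logits = np.stack([np.log(p_keep), np.log1p(-p_keep)], axis=-1)

    # Add Gumbel(0, 1) noise, then apply a temperature-scaled softmax.
    noisy = (logits + rng.gumbel(size=logits.shape)) / tau
    noisy = noisy - noisy.max(axis=-1, keepdims=True)  # numerical stability
    soft = np.exp(noisy)
    soft = soft / soft.sum(axis=-1, keepdims=True)

    keep_prob = soft[..., 0]                     # relaxed (differentiable) value
    adj = (keep_prob > 0.5).astype(float)        # hard sample: keep vs. drop
    np.fill_diagonal(adj, 0.0)                   # no self-edges in this sketch
    return keep_prob, adj


# Example: 5 hypothetical sensors with 8-dimensional embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(5, 8))
sim = cosine_similarity_matrix(embeddings)
keep_prob, adj = gumbel_softmax_edges(sim, tau=0.5, rng=rng)
```

In a trainable model the relaxed `keep_prob` would typically carry the gradient while the hard `adj` is used in the forward pass; lowering `tau` pushes the relaxed values closer to 0 or 1. Because noise is drawn independently per entry, the sampled graph in this sketch is directed rather than symmetric.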