Correlated Attention in Transformers for Multivariate Time Series

Quang Minh Nguyen, Lam M. Nguyen, Subhro Das
2023-11-21
Abstract: Multivariate time series (MTS) analysis prevails in real-world applications such as finance, climate science and healthcare. The various self-attention mechanisms, the backbone of the state-of-the-art Transformer-based models, efficiently discover the temporal dependencies, yet cannot well capture the intricate cross-correlation between different features of MTS data, which inherently stems from complex dynamical systems in practice. To this end, we propose a novel correlated attention mechanism, which not only efficiently captures feature-wise dependencies, but can also be seamlessly integrated within the encoder blocks of existing well-known Transformers to gain efficiency improvement. In particular, correlated attention operates across feature channels to compute cross-covariance matrices between queries and keys with different lag values, and selectively aggregate representations at the sub-series level. This architecture facilitates automated discovery and representation learning of not only instantaneous but also lagged cross-correlations, while inherently capturing time series auto-correlation. When combined with prevalent Transformer baselines, correlated attention mechanism constitutes a better alternative for encoder-only architectures, which are suitable for a wide range of tasks including imputation, anomaly detection and classification. Extensive experiments on the aforementioned tasks consistently underscore the advantages of correlated attention mechanism in enhancing base Transformer models, and demonstrate our state-of-the-art results in imputation, anomaly detection and classification.
Machine Learning, Artificial Intelligence
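The abstract describes attention that runs across feature channels, builds cross-covariance matrices between queries and keys at several lags, and then selectively aggregates the results. The following is a minimal NumPy sketch of that general idea, not the authors' implementation: the function name `lagged_feature_attention`, the circular shift used to apply a lag, the lag-relevance score, and the `top_k` selection rule are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def lagged_feature_attention(q, k, v, lags, top_k=2):
    """Illustrative feature-wise attention over several lags (hypothetical helper).

    q, k, v: arrays of shape (T, d), with T time steps and d feature channels.
    lags:    iterable of integer lags to try.
    Returns an array of shape (T, d).
    """
    T, d = q.shape
    q_c = q - q.mean(axis=0)
    scores, outputs = [], []
    for lag in lags:
        # Assumption: a lag is applied as a circular shift along the time axis.
        k_lag = np.roll(k, lag, axis=0)
        v_lag = np.roll(v, lag, axis=0)
        k_c = k_lag - k_lag.mean(axis=0)
        # (d, d) cross-covariance between query channels and lagged key channels.
        cov = q_c.T @ k_c / T
        # Each query channel distributes attention over the key channels.
        attn = softmax(cov, axis=-1)
        # Mix the lagged value channels with those weights: result is (T, d).
        outputs.append(v_lag @ attn.T)
        # Assumption: total absolute covariance serves as a lag-relevance score.
        scores.append(np.abs(cov).sum())
    scores = np.array(scores)
    keep = np.argsort(scores)[-top_k:]   # keep only the most relevant lags
    weights = softmax(scores[keep])
    return sum(w * outputs[i] for w, i in zip(weights, keep))


rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((16, 4)) for _ in range(3))
out = lagged_feature_attention(q, k, v, lags=[0, 1, 2, 4], top_k=2)
```

Note that the attention matrix here is `d × d` (feature by feature) rather than the `T × T` (time by time) matrix of standard self-attention, which is what lets it express dependencies between channels.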
What problem does this paper attempt to address?
The paper addresses a limitation of existing Transformer-based models for multivariate time series (MTS) analysis: they capture temporal dependencies well but struggle with the complex cross-correlations between different features. Specifically, it targets three problems:

1. **Improving the performance of Transformer models in non-predictive tasks**: Because existing models handle cross-correlations between features poorly, the paper proposes a correlated attention mechanism that efficiently captures feature-wise dependencies and improves performance on non-predictive tasks such as imputation, anomaly detection, and classification.
2. **Capturing lagged cross-correlations**: In MTS data, a change in one variable may show up in another variable only after a delay. Although this phenomenon is common in practice, existing Transformer-based methods have not fully exploited it. The proposed correlated attention mechanism is specifically designed to capture these lagged cross-correlations.
3. **Enhancing existing Transformer architectures**: The mechanism learns both instantaneous and lagged cross-correlations between variables, and it can be integrated into the encoder blocks of existing Transformer models (such as the vanilla Transformer and the Non-stationary Transformer) to enhance their performance.
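To make the notion of a lagged cross-correlation concrete, here is a small synthetic sketch. It is an illustration only, not taken from the paper: the helper `best_lag`, the 3-step delay, and the noise level are all assumed values.

```python
import numpy as np


def best_lag(x, y, max_lag):
    """Return the lag in [0, max_lag] at which y(t) correlates most strongly
    with x(t - lag), together with the correlation at every lag.
    Hypothetical helper for illustration."""
    corrs = []
    for lag in range(max_lag + 1):
        a = x[: len(x) - lag]   # x at time t - lag
        b = y[lag:]             # y at time t
        corrs.append(np.corrcoef(a, b)[0, 1])
    return int(np.argmax(np.abs(corrs))), corrs


rng = np.random.default_rng(0)
x = rng.standard_normal(500)

# y is a noisy copy of x delayed by 3 steps (assumed delay for this example).
y = np.empty_like(x)
y[3:] = x[:-3]
y[:3] = rng.standard_normal(3)
y += 0.1 * rng.standard_normal(500)

lag, corrs = best_lag(x, y, max_lag=10)
print(lag)
```

On this synthetic data the correlation peaks at the built-in 3-step delay, while the instantaneous (lag 0) correlation stays near zero. A model that only looks at same-time relationships between channels would miss this dependency entirely.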
In summary, the paper's core contribution is a correlated attention mechanism that captures the cross-correlations between different features of MTS data and, when integrated into existing Transformer models, significantly improves their performance on a range of non-predictive tasks.