Expressing Multivariate Time Series as Graphs with Time Series Attention Transformer

William T. Ng, K. Siu, Albert C. Cheung, Michael K. Ng
DOI: https://doi.org/10.48550/arXiv.2208.09300
2022-08-19
Abstract: A reliable and efficient representation of multivariate time series is crucial in various downstream machine learning tasks. In multivariate time series forecasting, each variable depends on its historical values and there are inter-dependencies among variables as well. Models have to be designed to capture both intra- and inter-relationships among the time series. To move towards this goal, we propose the Time Series Attention Transformer (TSAT) for multivariate time series representation learning. Using TSAT, we represent both temporal information and inter-dependencies of multivariate time series in terms of edge-enhanced dynamic graphs. The intra-series correlations are represented by nodes in a dynamic graph; a self-attention mechanism is modified to capture the inter-series correlations by using the super-empirical mode decomposition (SMD) module. We applied the embedded dynamic graphs to time series forecasting problems, including two real-world datasets and two benchmark datasets. Extensive experiments show that TSAT clearly outperforms six state-of-the-art baseline methods in various forecasting horizons. We further visualize the embedded dynamic graphs to illustrate the graph representation power of TSAT. We share our code at https://github.com/RadiantResearch/TSAT.
Machine Learning, Artificial Intelligence, Dynamical Systems, Representation Theory
What problem does this paper attempt to address?
This paper addresses representation learning and forecasting for multivariate time series (MTS). Specifically, it aims to provide a reliable and efficient representation of MTS data that improves performance on downstream machine-learning tasks. A core challenge in MTS forecasting is that each variable depends not only on its own historical values but also on the other variables. A model therefore needs to capture relationships both within each series (intra-series) and between series (inter-series). To this end, the authors propose the Time Series Attention Transformer (TSAT). The main contributions of TSAT are:

1. **Introduction of Edge-Enhanced Dynamic Graphs**: The intra-series and inter-series correlations of a multivariate time series are abstracted into a topological graph, where nodes represent individual time series and edges represent the correlations between time series. This representation allows graph neural networks (GNNs) to be applied directly to learning on dynamic graphs.

2. **Improvement of the Self-Attention Mechanism**: By incorporating the Super-Empirical Mode Decomposition (SMD) module, the self-attention mechanism is modified to better capture graph embeddings in time series forecasting problems. Specifically, formula (9) of the paper gives the modified multi-head self-attention operation:

\[ A_i=\left(\alpha_0 \sigma\left(\frac{Q_i K_i^T}{\sqrt{d_k}}\right)+\sum_{k = 1}^{K} \alpha_k \sigma(\text{Dim}f_k)+\alpha_{K + 1} A\right) V_i \]

where $\alpha_0$, $\alpha_1, \dots, \alpha_K$ and $\alpha_{K+1}$ are trainable parameters that weight the node-feature term, the edge-feature terms and the adjacency-matrix term $A$, respectively.

3. **Provision of extensive experimental verification**: Experiments were carried out on two real-world datasets and two benchmark datasets, and the results show that TSAT outperforms six state-of-the-art baseline methods across different forecasting horizons. In addition, ablation studies indicate that the dynamic graph structure and the edge features both matter for prediction accuracy.

In conclusion, this paper addresses key problems in multivariate time series representation and forecasting by introducing the TSAT framework, which performs particularly well at capturing intra-series and inter-series relationships.
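To make the structure of formula (9) concrete, the following is a minimal NumPy sketch of a single attention head. It is an illustration only, not the authors' implementation. It assumes that $\sigma$ is a row-wise softmax, that each $\text{Dim}f_k$ is an $n \times n$ matrix of edge features over the $n$ series, and that $A$ is an $n \times n$ adjacency matrix. The function and argument names (`edge_enhanced_attention`, `edge_feats`, `adj`, `alphas`) are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def edge_enhanced_attention(Q, K, V, edge_feats, adj, alphas):
    """One head of the modified attention in formula (9), under stated assumptions.

    Q, K       : (n, d_k) query and key matrices for n series (nodes)
    V          : (n, d_v) value matrix
    edge_feats : (K_edge, n, n) stack of edge-feature matrices (assumed shape)
    adj        : (n, n) adjacency matrix
    alphas     : (K_edge + 2,) mixing weights; trainable in a real model,
                 plain numbers here
    """
    d_k = Q.shape[-1]
    # alpha_0 * sigma(Q K^T / sqrt(d_k)): the usual scaled dot-product term.
    mix = alphas[0] * softmax(Q @ K.T / np.sqrt(d_k))
    # sum_k alpha_k * sigma(edge feature k): the edge-feature terms.
    for k, f_k in enumerate(edge_feats, start=1):
        mix = mix + alphas[k] * softmax(f_k)
    # alpha_{K+1} * A: the adjacency term, used without sigma as in the formula.
    mix = mix + alphas[-1] * adj
    # Apply the combined (n, n) weights to the values.
    return mix @ V


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d_k, d_v, k_edge = 4, 8, 5, 3
    Q = rng.normal(size=(n, d_k))
    K = rng.normal(size=(n, d_k))
    V = rng.normal(size=(n, d_v))
    edge_feats = rng.normal(size=(k_edge, n, n))
    adj = np.eye(n)
    alphas = np.full(k_edge + 2, 1.0 / (k_edge + 2))
    out = edge_enhanced_attention(Q, K, V, edge_feats, adj, alphas)
    print(out.shape)
```

Setting `alphas` to `[1, 0, ..., 0]` recovers standard scaled dot-product attention, which shows that the edge-feature and adjacency terms act as additive corrections to the usual attention weights.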