Abstract: Accurate long sequence time series forecasting (LSTF) remains a key challenge due to its complex time-dependent nature. Multivariate time series forecasting methods inherently assume that variables are interrelated and that the future state of each variable depends not only on its own history but also on other variables. However, most existing methods, such as the Transformer, cannot effectively exploit the potential spatial correlation between variables. To address these problems, we propose a Transformer-based LSTF model, called Graphformer, which can efficiently learn complex temporal patterns and dependencies between multiple variables. First, in the encoder's self-attention downsampling layer, Graphformer replaces the standard convolutional layer with a dilated convolutional layer to efficiently capture long-term dependencies between time series at different granularity levels. Meanwhile, Graphformer replaces the self-attention mechanism with a graph self-attention mechanism that automatically infers the implicit sparse graph structure from the data; this generalizes better to time series without an explicit graph structure and learns implicit spatial dependencies between sequences. In addition, Graphformer uses a temporal inertia module to enhance the sensitivity of future time steps to recent inputs, and a multi-scale feature fusion operation that slices and fuses feature maps to extract temporal correlations at different granularity levels, improving model accuracy and efficiency. Compared with SOTA Transformer-based models, Graphformer significantly improves long sequence time series forecasting accuracy.

What problem does this paper attempt to address?

The paper addresses key challenges in Long Sequence Time Series Forecasting (LSTF), particularly the complex temporal dependencies and potential spatial correlations in multivariate time series data. The proposed model, Graphformer, is based on the Transformer architecture and is designed to capture both long-term and spatial dependencies in multivariate time series. Specifically, Graphformer addresses the problem through the following points:

1. **Dilated Convolution in Self-Attention Down-Sampling**: In the self-attention down-sampling layer of the encoder, Graphformer replaces the standard convolution layer with dilated causal convolution to efficiently capture long-term dependencies between time series at different granularity levels.
2. **Graph Self-Attention Mechanism**: Graphformer introduces a graph self-attention mechanism that automatically infers implicit sparse graph structures from the data. This is particularly useful for time series without explicit graph structures, and it lets the model learn implicit spatial dependencies between sequences.
3. **Multi-Scale Feature Fusion**: To extract temporal dependencies at different scale levels, Graphformer slices the feature map and fuses the slices through a multi-scale pyramid network to capture cross-scale feature information.
4. **Temporal Inertia Module**: Graphformer uses a temporal inertia module to enhance the sensitivity of future time steps to recent inputs, which helps improve the accuracy of the model.
5. **Overall Architecture**: Graphformer follows an encoder-decoder structure. The encoder stacks three multi-head sparse graph self-attention modules, two self-attention down-sampling modules, and a multi-scale feature fusion layer; the decoder includes a masked sparse graph self-attention module and a sparse graph self-attention module.

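To make the dilated causal convolution mentioned above more concrete, here is a minimal pure-Python sketch. It is not the paper's implementation: the function names, the two-tap averaging kernel, and the zero left-padding are assumptions chosen for illustration. It shows how a causal filter only looks backward in time, and how stacking layers with growing dilation widens the receptive field without adding weights.

```python
def dilated_causal_conv1d(x, kernel, dilation=1):
    """1-D causal convolution with dilation over a list of floats.

    Output step t only sees x[t], x[t - d], x[t - 2d], ...; positions
    before the start of the series are treated as zeros (left padding).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t - j * dilation  # look back j * dilation steps
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input steps visible to one output after stacking layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
kernel = [0.5, 0.5]  # hypothetical weights: average of two taps

print(dilated_causal_conv1d(series, kernel, dilation=1))
print(dilated_causal_conv1d(series, kernel, dilation=2))
print(receptive_field(kernel_size=3, dilations=[1, 2, 4]))  # 15
```

With dilation 2, each output averages the current step and the step two positions earlier, so the same two weights cover a wider window. Three stacked layers with kernel size 3 and dilations 1, 2, 4 already see 15 input steps.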
Through these methods, Graphformer can significantly improve the accuracy of long sequence time series forecasting while maintaining low computational costs, especially when compared to existing Transformer-based models. Experimental results show that Graphformer outperforms state-of-the-art methods on real-world datasets from multiple domains, such as energy, transportation, and disease.
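The idea of inferring an implicit sparse graph between variables, as in the graph self-attention mechanism described above, can be sketched as follows. The summary does not say how Graphformer sparsifies its attention, so the top-k rule, the scaled dot-product scoring, and all names here (`sparse_graph_attention`, `aggregate`, the toy embeddings) are assumptions made for illustration only.

```python
import math


def sparse_graph_attention(embeddings, top_k):
    """Infer a row-normalised sparse adjacency from node embeddings.

    embeddings: one vector per variable (node).
    top_k: number of neighbours each node keeps (an assumed sparsity rule).
    """
    n = len(embeddings)
    dim = len(embeddings[0])
    adjacency = []
    for i in range(n):
        # Scaled dot-product score between node i and every node j.
        scores = [
            sum(a * b for a, b in zip(embeddings[i], embeddings[j])) / math.sqrt(dim)
            for j in range(n)
        ]
        # Keep only the top_k highest-scoring neighbours of node i.
        keep = sorted(range(n), key=lambda j: scores[j], reverse=True)[:top_k]
        peak = max(scores[j] for j in keep)  # subtracted for numerical stability
        weights = {j: math.exp(scores[j] - peak) for j in keep}
        total = sum(weights.values())
        # Softmax over the kept neighbours; every other edge gets weight 0.
        adjacency.append([weights.get(j, 0.0) / total for j in range(n)])
    return adjacency


def aggregate(adjacency, values):
    """Mix per-variable feature vectors along the inferred edges."""
    width = len(values[0])
    return [
        [sum(w * v[d] for w, v in zip(row, values)) for d in range(width)]
        for row in adjacency
    ]


# Hypothetical embeddings for four variables: 0 and 1 are similar, as are 2 and 3.
emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
adj = sparse_graph_attention(emb, top_k=2)
for row in adj:
    print([round(w, 3) for w in row])
mixed = aggregate(adj, emb)
```

In this toy example, variables 0 and 1 attend only to each other and to themselves, and likewise variables 2 and 3, so the inferred adjacency is block-structured even though no graph was supplied.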

Graphformer: Adaptive graph correlation transformer for multivariate long sequence time series forecasting

Foreformer: an Enhanced Transformer-Based Framework for Multivariate Time Series Forecasting

TFEformer: Temporal Feature Enhanced Transformer for Multivariate Time Series Forecasting

Generalizable Memory-driven Transformer for Multivariate Long Sequence Time-series Forecasting

Hidformer: Hierarchical Dual-Tower Transformer Using Multi-Scale Mergence for Long-Term Time Series Forecasting

DyGraphformer: Transformer combining dynamic spatio-temporal graph network for multivariate time series forecasting

Multivariate long sequence time-series forecasting using dynamic graph learning

A hybrid framework for multivariate long-sequence time series forecasting

Resformer: Combine quadratic linear transformation with efficient sparse Transformer for long-term series forecasting

Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting

RSMformer: an efficient multiscale transformer-based framework for long sequence time-series forecasting

Multi-scale convolution enhanced transformer for multivariate long-term time series forecasting

Expanding the Prediction Capacity in Long Sequence Time-Series Forecasting

Knowledge-enhanced Transformer for Multivariate Long Sequence Time-series Forecasting

sTransformer: A Modular Approach for Extracting Inter-Sequential and Temporal Information for Time-Series Forecasting

Multi-resolution Time-Series Transformer for Long-term Forecasting

Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution

Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting

Adaptive Graph Convolutional Network Framework for Multidimensional Time Series Prediction

Stecformer: Spatio-temporal Encoding Cascaded Transformer for Multivariate Long-term Time Series Forecasting

A Joint Time-Frequency Domain Transformer for multivariate time series forecasting