Abstract: Forecasting multivariate long-term time series remains a persistent challenge in data analysis. Among deep learning methods, the Transformer has shown significant potential for time series forecasting. However, the quadratic computational complexity of its dot-product attention mechanism impairs training and forecasting efficiency. In addition, the Transformer architecture has limitations in modeling local features and in handling cross-dimensional dependencies among variables. In this article, a Multi-Scale Convolution Enhanced Transformer model (MSCformer) is proposed for multivariate long-term time series forecasting. Rather than modeling the time series in its entirety, a segmentation strategy converts the original input series into segments of different lengths, which are then processed by a newly constructed Multi-Dependency Aggregation module. This multi-scale segmentation approach reduces the computational complexity of the attention mechanism in the subsequent model. Because each segment length corresponds to a specific time scale, it also ensures that each segment retains sequence-level semantic information, so that the multi-scale information in the data is used comprehensively and the real dependencies of the time series are captured more accurately. The Multi-Dependency Aggregation module captures both cross-temporal and cross-dimensional dependencies of multivariate long-term time series and compensates for local dependencies within the segments, thereby capturing local series features comprehensively and addressing the issue of insufficient information utilization. MSCformer synthesizes the dependency information extracted from temporal segments at different scales and reconstructs the future series using linear layers. MSCformer exhibits higher forecasting accuracy, outperforming existing methods in multiple domains including energy, transportation, weather, electricity, disease and finance.
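The multi-scale segmentation idea and its effect on attention cost can be illustrated with a minimal sketch. It is not the MSCformer implementation: the abstract does not give the segment lengths, whether segments overlap, or how leftover time steps are handled. The names `segment`, `multi_scale_segments`, and `attention_pairs`, the segment lengths (6, 12, 24), and the choice of non-overlapping segments are all assumptions made for illustration.

```python
def segment(series, seg_len):
    """Split a [time][variable] series into non-overlapping segments.

    Assumption: trailing steps that do not fill a whole segment are dropped.
    """
    n_segments = len(series) // seg_len
    return [series[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


def multi_scale_segments(series, seg_lens):
    """Segment the same series once per scale (hypothetical helper)."""
    return {seg_len: segment(series, seg_len) for seg_len in seg_lens}


def attention_pairs(n_tokens):
    """Number of query-key pairs scored by full dot-product attention."""
    return n_tokens * n_tokens


# Toy input: 96 time steps, 3 variables.
series = [[float(t + d) for d in range(3)] for t in range(96)]

# Illustrative segment lengths; each one stands for a different time scale.
scales = multi_scale_segments(series, seg_lens=(6, 12, 24))

print("point-wise tokens:", len(series), "pairs:", attention_pairs(len(series)))
for seg_len, segs in scales.items():
    print("segment length:", seg_len,
          "tokens:", len(segs),
          "pairs:", attention_pairs(len(segs)))
```

With one token per segment instead of one per time step, the 96-step toy series yields 16, 8, and 4 tokens at the three scales, so full attention scores 256, 64, and 16 pairs rather than 9216. This is the sense in which segmentation reduces the quadratic cost.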

MEAformer: an All-MLP Transformer with Temporal External Attention for Long-Term Time Series Forecasting

Foreformer: an Enhanced Transformer-Based Framework for Multivariate Time Series Forecasting

TFEformer: Temporal Feature Enhanced Transformer for Multivariate Time Series Forecasting

Hidformer: Hierarchical Dual-Tower Transformer Using Multi-Scale Mergence for Long-Term Time Series Forecasting

Bidformer: A Transformer-Based Model Via Bidirectional Sparse Self-Attention Mechanism for Long Sequence Time-Series Forecasting

Generalizable Memory-driven Transformer for Multivariate Long Sequence Time-series Forecasting

InParformer: Evolutionary Decomposition Transformers with Interactive Parallel Attention for Long-Term Time Series Forecasting

Muformer: A Long Sequence Time-Series Forecasting Model Based on Modified Multi-Head Attention

Periodformer: an Efficient Long-Term Time Series Forecasting Method Based on Periodic Attention

Detformer: Detect the Reliable Attention Index for Ultra-long Time Series Forecasting

TFformer: A Time-Frequency Domain Bidirectional Sequence-Level Attention Based Transformer for Interpretable Long-Term Sequence Forecasting

AD-autoformer: decomposition transformers with attention distilling for long sequence time-series forecasting

RSMformer: an efficient multiscale transformer-based framework for long sequence time-series forecasting

Resformer: Combine quadratic linear transformation with efficient sparse Transformer for long-term series forecasting

SMARTformer: Semi-Autoregressive Transformer with Efficient Integrated Window Attention for Long Time Series Forecasting

SDformer: Transformer with Spectral Filter and Dynamic Attention for Multivariate Time Series Long-term Forecasting

Multi-scale convolution enhanced transformer for multivariate long-term time series forecasting

PETformer: Long-term Time Series Forecasting via Placeholder-enhanced Transformer

Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting

sTransformer: A Modular Approach for Extracting Inter-Sequential and Temporal Information for Time-Series Forecasting

Does Long-Term Series Forecasting Need Complex Attention and Extra Long Inputs?