Abstract: The emergence of deep learning has yielded noteworthy advancements in time series forecasting (TSF). Transformer architectures, in particular, have been widely adopted for TSF tasks. Transformers have proven to be the most successful solution for extracting the semantic correlations among the elements within a long sequence. Numerous variants have enabled the transformer architecture to effectively handle long-term time series forecasting (LTSF) tasks. In this article, we first present a comprehensive overview of transformer architectures and their subsequent enhancements developed to address various LTSF tasks. Then, we summarize the publicly available LTSF datasets and relevant evaluation metrics. Furthermore, we provide valuable insights into the best practices and techniques for effectively training transformers in the context of time-series analysis. Lastly, we propose potential research directions in this rapidly evolving field.

What problem does this paper attempt to address?

The paper focuses on Long-term Time Series Forecasting (LTSF). Specifically, it addresses the following aspects:

1. **Problem Background**: As data collection technologies have developed, time series forecasting tasks have gradually evolved to use more historical data to predict longer-term future trends. However, traditional statistical methods are limited when dealing with non-stationary time series that exhibit complex nonlinear relationships.
2. **Limitations of Existing Solutions**: Machine learning methods (such as Support Vector Machines and Adaptive Boosting algorithms) and deep learning methods (such as Recurrent Neural Networks (RNN) and their variants LSTM and GRU) have made progress, but they still face issues such as vanishing or exploding gradients and a limited ability to capture long-distance dependencies.
3. **Advantages and Applications of Transformer**: The paper points out that the Transformer architecture, especially its self-attention mechanism, is well suited to long time series data because it can capture both long-term and short-term dependencies, and that it has shown excellent performance in LTSF tasks.
4. **Research Objectives**: The paper aims to provide a comprehensive overview of the Transformer architecture and its applications in the LTSF field. Specifically, this includes:
   - Providing a comprehensive review of the Transformer architecture and its improved versions.
   - Summarizing publicly available LTSF datasets and related evaluation metrics.
   - Analyzing how to effectively train Transformers for time series analysis.
   - Exploring future research directions in this field.
5. **Paper Structure**: The paper first introduces the basic principles and components of the Transformer, including the self-attention mechanism, multi-head attention, encoder, and decoder. Then, it analyzes the key challenges of LTSF and discusses some Transformer-based LTSF architectures that have emerged in recent years, such as LogSparse transformer, Informer, Autoformer, etc. These architectures reduce computational complexity and improve prediction accuracy by introducing various optimization measures (e.g., sparse attention mechanisms, sequence decomposition, etc.).

In short, the paper systematically reviews and analyzes Transformer-based time series forecasting methods, giving researchers a comprehensive framework for understanding the field and pointing out directions for further research.
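To make the self-attention mechanism mentioned above more concrete, here is a minimal sketch of single-head scaled dot-product self-attention over a toy sequence. It is not taken from the paper: the helper names (`matmul`, `softmax`, `self_attention`, `random_matrix`), the dimensions, and the random weights are all illustrative assumptions, and it uses only the Python standard library to stay self-contained, whereas a practical implementation would use an array library.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over a single list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x has shape (seq_len, d_model); each projection has shape (d_model, d_k).
    Returns the attended values (seq_len, d_k) and the attention weights
    (seq_len, seq_len).
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of k: (d_k, seq_len)
    # Every time step is scored against every other time step, so this
    # matrix has seq_len * seq_len entries.
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Hypothetical helper: a matrix of standard normal samples."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Toy input: 24 time steps, each embedded into 8 features (assumed sizes).
rng = random.Random(0)
seq_len, d_model, d_k = 24, 8, 4
x = random_matrix(seq_len, d_model, rng)
w_q, w_k, w_v = (random_matrix(d_model, d_k, rng) for _ in range(3))

out, weights = self_attention(x, w_q, w_k, w_v)
print(len(out), len(out[0]))          # shape of the attended values
print(len(weights), len(weights[0]))  # shape of the attention matrix
```

Because every time step attends to every other one, the score matrix grows quadratically with the input length. That cost is what the sparse-attention variants discussed in the summary aim to reduce.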

A Systematic Review for Transformer-based Long-term Series Forecasting

Foreformer: an Enhanced Transformer-Based Framework for Multivariate Time Series Forecasting

TFEformer: Temporal Feature Enhanced Transformer for Multivariate Time Series Forecasting

Hidformer: Hierarchical Dual-Tower Transformer Using Multi-Scale Mergence for Long-Term Time Series Forecasting

Are Transformers Effective for Time Series Forecasting?

Deep Time Series Forecasting Models: A Comprehensive Survey

Generalizable Memory-driven Transformer for Multivariate Long Sequence Time-series Forecasting

sTransformer: A Modular Approach for Extracting Inter-Sequential and Temporal Information for Time-Series Forecasting

Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution

xLSTMTime : Long-term Time Series Forecasting With xLSTM

Enhancing Time Series Forecasting: A Hierarchical Transformer with Probabilistic Decomposition Representation

Two Steps Forward and One Behind: Rethinking Time Series Forecasting with Deep Learning

Time Series Forecasting (TSF) Using Various Deep Learning Models

Multi-resolution Time-Series Transformer for Long-term Forecasting

iTransformer: Inverted Transformers Are Effective for Time Series Forecasting

DRFormer: Multi-Scale Transformer Utilizing Diverse Receptive Fields for Long Time-Series Forecasting

Resformer: Combine quadratic linear transformation with efficient sparse Transformer for long-term series forecasting

Leveraging 2D Information for Long-term Time Series Forecasting with Vanilla Transformers

Enhanced Linear and Vision Transformer-Based Architectures for Time Series Forecasting

RSMformer: an efficient multiscale transformer-based framework for long sequence time-series forecasting