A Systematic Review for Transformer-based Long-term Series Forecasting

Liyilei Su, Xumin Zuo, Rui Li, Xin Wang, Heng Zhao, Bingding Huang
2023-10-31
Abstract: The emergence of deep learning has yielded noteworthy advancements in time series forecasting (TSF). Transformer architectures, in particular, have witnessed broad utilization and adoption in TSF tasks. Transformers have proven to be the most successful solution to extract the semantic correlations among the elements within a long sequence. Various variants have enabled transformer architecture to effectively handle long-term time series forecasting (LTSF) tasks. In this article, we first present a comprehensive overview of transformer architectures and their subsequent enhancements developed to address various LTSF tasks. Then, we summarize the publicly available LTSF datasets and relevant evaluation metrics. Furthermore, we provide valuable insights into the best practices and techniques for effectively training transformers in the context of time-series analysis. Lastly, we propose potential research directions in this rapidly evolving field.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper primarily focuses on Long-term Time Series Forecasting (LTSF). Specifically, it addresses the following aspects:

1. **Problem Background**: As data collection technologies have developed, time series forecasting has shifted toward using more historical data to predict longer-term future trends. However, traditional statistical methods are limited when dealing with non-stationary time series that have complex nonlinear relationships.
2. **Limitations of Existing Solutions**: Machine learning methods (such as Support Vector Machines and Adaptive Boosting algorithms) and deep learning methods (such as Recurrent Neural Networks (RNN) and their variants LSTM and GRU) have made progress, but they still suffer from vanishing or exploding gradients and a limited ability to capture long-range dependencies.
3. **Advantages and Applications of Transformer**: The paper points out that the Transformer architecture, especially its self-attention mechanism, is well suited to long time series because it can capture both long-term and short-term dependencies, and that it has shown excellent performance in LTSF tasks.
4. **Research Objectives**: The paper aims to provide a comprehensive overview of the Transformer architecture and its applications in the LTSF field. Specifically, this includes:
   - Providing a comprehensive review of the Transformer architecture and its improved versions.
   - Summarizing publicly available LTSF datasets and related evaluation metrics.
   - Analyzing how to effectively train Transformers for time series analysis.
   - Exploring future research directions in this field.
5. **Paper Structure**: The paper first introduces the basic principles and components of the Transformer, including the self-attention mechanism, multi-head attention, encoder, and decoder. It then analyzes the key challenges of LTSF and discusses Transformer-based LTSF architectures that have emerged in recent years, such as LogSparse transformer, Informer, and Autoformer. According to the paper, these architectures significantly reduce computational complexity and improve prediction accuracy through optimizations such as sparse attention mechanisms and sequence decomposition.

In short, the paper aims to give researchers a comprehensive framework for understanding the field by systematically reviewing and analyzing Transformer-based time series forecasting methods and pointing out directions for further research.
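
To make the self-attention mechanism mentioned above more concrete, the following is a minimal sketch of scaled dot-product self-attention over a short series, written in plain Python. It is an illustration only and is not taken from the paper: the function names and the toy `series` values are hypothetical, and queries, keys, and values are taken to be the inputs themselves, whereas a real Transformer would apply learned projection matrices first. The sketch also shows why long inputs are costly: every time step is scored against every other, giving an n × n weight matrix, which is the cost that sparse-attention variants aim to reduce.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with identity projections.

    x: list of n time steps, each a list of d floats.
    Returns (outputs, weights), where weights is an n x n matrix.
    Assumption: Q = K = V = x (no learned projections), for illustration.
    """
    d = len(x[0])
    scale = math.sqrt(d)
    # Each time step (query) is scored against every time step (key):
    # n * n dot products, hence quadratic cost in the sequence length.
    scores = [
        [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in x]
        for q in x
    ]
    # Each row of weights sums to 1 and says how much a step attends to the others.
    weights = [softmax(row) for row in scores]
    # Each output is a weighted average of all time steps (values).
    outputs = [
        [sum(w * v[j] for w, v in zip(row, x)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights


# Hypothetical toy series: 4 time steps, 2 features per step.
series = [[0.1, 0.2], [0.4, 0.1], [0.3, 0.9], [0.8, 0.5]]
outputs, weights = self_attention(series)
```

Here `weights` has one row per time step, and each row is a probability distribution over all time steps, so distant steps can influence an output as directly as neighboring ones.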