SMARTformer: Semi-Autoregressive Transformer with Efficient Integrated Window Attention for Long Time Series Forecasting.

Yiduo Li,Shiyi Qi,Zhe Li,Zhongwen Rao,Lujia Pan,Zenglin Xu
DOI: https://doi.org/10.24963/ijcai.2023/241
2023-01-01
Abstract:The success of Transformers in long time series forecasting (LTSF) can be attributed to their attention mechanisms and non-autoregressive (NAR) decoder structures, which capture long-range dependencies. However, time series data also contain abundant local temporal dependencies, which are often overlooked in the literature and significantly hinder forecasting performance. To address this issue, we introduce SMARTformer, which stands for S e M i- A uto R egressive T ransformer. SMART-former utilizes the Integrated Window Attention (IWA) and Semi-AutoRegressive (SAR) Decoder to capture global and local dependencies from both encoder and decoder perspectives. IWA conducts local self-attention in multi-scale windows and global attention across windows with linear complexity to achieve complementary clues in local and enlarged receptive fields. SAR generates subsequences iteratively, similar to autoregressive (AR) decoding, but refines the entire sequence in a NAR manner. This way, SAR benefits from both the global horizon of NAR and the local detail capturing of AR.We also introduce the Time-Independent Embedding (TIE), which better captures local dependencies by avoiding entanglements of various periods that can occur when directly adding positional embedding to value embedding. Our extensive experiments on five benchmark datasets demonstrate the effectiveness of SMARTformer against state-of-the-art models, achieving an improvement of 10.2% and 18.4% in multivariate and univariate long-term forecasting, respectively.
What problem does this paper attempt to address?