MEAformer: an All-Mlp Transformer with Temporal External Attention for Long-Term Time Series Forecasting

Siyuan Huang,Yepeng Liu,Haoyi Cui,Fan Zhang,Jinjiang Li,Xiaofeng Zhang,Mingli Zhang,Caiming Zhang
DOI: https://doi.org/10.1016/j.ins.2024.120605
IF: 8.1
2024-01-01
Information Sciences
Abstract:Transformer-based models have significantly improved performance in Long-term Time Series Forecasting (LTSF). These models employ various self-attention mechanisms to discover long-term dependencies. However, the computational efficiency is hampered by the inherent permutation invariance of self-attention, and they primarily focus on relationships within the sequence while neglecting potential relationships between different sample sequences. This limits the ability and flexibility of self-attention in LTSF. In addition, the Transformer's decoder outputs sequences in an autoregressive manner, leading to slow inference speed and error accumulation effects, especially for LTSF. Regarding the issues with Transformer-based models for LTSF, we propose a model better suited for LTSF, named MEAformer. MEAformer adopts a fully connected Multi-Layer Perceptron (MLP) architecture consisting of two types of layers: encoder layers and MLP layers. Unlike most encoder layers in Transformer-based models, the MEAformer replaces self-attention with temporal external attention. Temporal external attention explores potential relationships between different sample sequences in the training dataset. Compared to the quadratic complexity of self-attention mechanisms, temporal external attention has efficient linear complexity. Encoder layers can be stacked multiple times to capture time-dependent relationships at different scales. Furthermore, the MEAformer replaces the intricate decoder layers of the original model with more straightforward MLP layers. This modification aims to enhance inference speed and facilitate single-pass sequence generation, effectively mitigating the problem of error accumulation effects. Regarding long-term forecasting, MEAformer achieves state-of-the-art performance on six benchmark datasets, covering five real-world domains: energy, transportation, economy, weather, and disease. Code is available at: https://github.com/huangsiyuan924/MEAformer.
What problem does this paper attempt to address?