FMamba: Mamba based on Fast-attention for Multivariate Time-series Forecasting

Shusen Ma, Yu Kang, Peng Bai, Yun-Bo Zhao
2024-07-20
Abstract: In multivariate time-series forecasting (MTSF), extracting the temporal correlations of the input sequences is crucial. While popular Transformer-based predictive models can perform well, their quadratic computational complexity results in inefficiency and high overhead. The recently emerged Mamba, a selective state space model, has shown promising results in many fields due to its strong temporal feature extraction capabilities and linear computational complexity. However, due to the unilateral nature of Mamba, channel-independent predictive models based on Mamba cannot attend to the relationships among all variables in the manner of Transformer-based models. To address this issue, we combine fast-attention with Mamba to introduce a novel framework named FMamba for MTSF. Technically, we first extract the temporal features of the input variables through an embedding layer, then compute the dependencies among input variables via the fast-attention module. Subsequently, we use Mamba to selectively deal with the input features and further extract the temporal dependencies of the variables through the multi-layer perceptron block (MLP-block). Finally, FMamba obtains the predictive results through the projector, a linear layer. Experimental results on eight public datasets demonstrate that FMamba can achieve state-of-the-art performance while maintaining low computational overhead.
Machine Learning
What problem does this paper attempt to address?
The paper addresses multivariate time-series forecasting (MTSF) and proposes a new framework, FMamba, to resolve the tension between computational efficiency and forecasting performance in existing methods. It targets two issues:

1. **Computational efficiency**: Transformer-based forecasting models predict well, but their self-attention mechanism has quadratic computational complexity, so cost rises sharply as the input sequence grows. A method with linear computational complexity is therefore needed.
2. **Global variable correlation**: The recently proposed Mamba model has linear computational complexity, but because it processes its input unidirectionally it cannot, unlike the Transformer, capture the relationships among all variables.

To address both issues, the authors propose FMamba, which combines the fast-attention mechanism with Mamba. Its main contributions are:

- **Combination of fast-attention and Mamba**: The fast-attention module captures the correlations between variables, while Mamba lets the model selectively focus on or ignore parts of the input, which improves robustness.
- **Linear computational complexity**: FMamba's computational cost scales linearly with the input sequence length, which keeps computational overhead low.
- **Forecasting performance**: Experimental results show that FMamba achieves state-of-the-art (SOTA) performance on eight public datasets, demonstrating its effectiveness for multivariate time-series forecasting.

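
The difference between quadratic and linear attention cost can be illustrated with a short sketch. The text above does not spell out the fast-attention formulation, so the block below assumes a generic kernelized linear attention with an `elu(x) + 1` feature map. That choice and all function names are illustrative assumptions, not the authors' implementation. It shows that reordering the matrix products yields the same output without ever forming the n × n score matrix.

```python
import numpy as np


def feature_map(x):
    # elu(x) + 1: a strictly positive feature map (an assumed, illustrative choice).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def quadratic_kernel_attention(q, k, v):
    """Kernelized attention computed the direct way.

    q, k: (n, d) arrays; v: (n, d_v) array.
    Builds the full (n, n) score matrix, so cost grows as O(n^2).
    """
    qf, kf = feature_map(q), feature_map(k)
    scores = qf @ kf.T                                   # (n, n)
    weights = scores / scores.sum(axis=1, keepdims=True)
    return weights @ v                                   # (n, d_v)


def linear_kernel_attention(q, k, v):
    """Same result, using associativity: phi(Q) (phi(K)^T V).

    Only (d, d_v) and (d,) summaries of the keys and values are built,
    so cost grows as O(n * d * d_v), linear in n.
    """
    qf, kf = feature_map(q), feature_map(k)
    kv = kf.T @ v                                        # (d, d_v)
    k_sum = kf.sum(axis=0)                               # (d,)
    normalizer = qf @ k_sum                              # (n,)
    return (qf @ kv) / normalizer[:, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, d_v = 32, 8, 5                                 # hypothetical sizes
    q = rng.standard_normal((n, d))
    k = rng.standard_normal((n, d))
    v = rng.standard_normal((n, d_v))
    same = np.allclose(quadratic_kernel_attention(q, k, v),
                       linear_kernel_attention(q, k, v))
    print("outputs match:", same)
```

Both functions compute the same kernelized attention; only the order of multiplication differs. The linear form gives up the softmax in exchange for avoiding the n × n intermediate, which is the general trade-off behind linear-attention variants.
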
In summary, the paper aims to develop a multivariate time-series forecasting model that is both efficient and accurate: it avoids the quadratic cost of Transformer-based models while recovering the cross-variable modeling that Mamba alone lacks.
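
The "selective" and unidirectional behaviour attributed to Mamba can be sketched with a toy recurrence. The block below is a heavily simplified, single-channel selective state space scan written for illustration only: the parameter names (`w_B`, `w_C`, `w_delta`, `b_delta`), the scalar projections, and the simplified discretization are assumptions, and it is not the Mamba or FMamba implementation. It shows two things: the step size and the input/output projections depend on the current input, and each output depends only on past and current inputs.

```python
import numpy as np


def softplus(z):
    # Numerically stable log(1 + exp(z)); keeps the step size positive.
    return np.logaddexp(0.0, z)


def selective_scan(x, A, w_B, w_C, w_delta, b_delta):
    """Toy selective state space scan over one channel.

    x: (T,) input sequence.
    A: (N,) negative diagonal state matrix (fixed).
    w_B, w_C: (N,) weights producing the input-dependent B_t and C_t.
    w_delta, b_delta: scalars producing the input-dependent step size.
    Returns y: (T,). Cost is one pass over the sequence, i.e. O(T * N).
    """
    h = np.zeros_like(A, dtype=float)
    y = np.empty(len(x))
    for t, x_t in enumerate(x):
        delta = softplus(w_delta * x_t + b_delta)  # small delta: mostly keep the old state
        A_bar = np.exp(delta * A)                  # per-state decay in (0, 1) because A < 0
        B_t = w_B * x_t                            # input-dependent input projection
        C_t = w_C * x_t                            # input-dependent output projection
        h = A_bar * h + delta * B_t * x_t          # state update (simplified discretization)
        y[t] = C_t @ h                             # readout
    return y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, N = 12, 4                                   # hypothetical sizes
    x = rng.standard_normal(T)
    A = -np.arange(1.0, N + 1.0)
    w_B = rng.standard_normal(N)
    w_C = rng.standard_normal(N)
    y = selective_scan(x, A, w_B, w_C, w_delta=0.5, b_delta=0.0)
    print(y.shape)
```

Because the loop only ever looks backward, changing a later input leaves all earlier outputs untouched. A scan of this kind therefore cannot, on its own, relate every token to every other token, which is the gap the fast-attention module is meant to fill.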