A Mamba Foundation Model for Time Series Forecasting

Haoyu Ma, Yushu Chen, Wenlai Zhao, Jinzhe Yang, Yingsheng Ji, Xinghua Xu, Xiaozhu Liu, Hao Jing, Shengzhuo Liu, Guangwen Yang

2024-11-05

Abstract: Time series foundation models have demonstrated strong performance in zero-shot learning, making them well-suited for predicting rapidly evolving patterns in real-world applications where relevant training data are scarce. However, most of these models rely on the Transformer architecture, which incurs quadratic complexity as input length increases. To address this, we introduce TSMamba, a linear-complexity foundation model for time series forecasting built on the Mamba architecture. The model captures temporal dependencies through both forward and backward Mamba encoders, achieving high prediction accuracy. To reduce reliance on large datasets and lower training costs, TSMamba employs a two-stage transfer learning process that leverages pretrained Mamba LLMs, allowing effective time series modeling with a moderate training set. In the first stage, the forward and backward backbones are optimized via patch-wise autoregressive prediction; in the second stage, the model trains a prediction head and refines other components for long-term forecasting. While the backbone assumes channel independence to manage varying channel numbers across datasets, a channel-wise compressed attention module is introduced to capture cross-channel dependencies during fine-tuning on specific multivariate datasets. Experiments show that TSMamba's zero-shot performance is comparable to state-of-the-art time series foundation models, despite using significantly less training data. It also achieves competitive or superior full-shot performance compared to task-specific prediction models. The code will be made publicly available.
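The patch-wise autoregressive objective mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the patch length, the patching scheme, or how the backward encoder is fed, so everything here is an assumption for illustration: non-overlapping patches, a forward task that predicts the next patch, and a backward task that sees the patches in reversed order. The function names `make_patches` and `autoregressive_pairs` are hypothetical and do not come from the paper.

```python
import numpy as np

def make_patches(series, patch_len):
    """Split a 1-D series into non-overlapping patches, dropping any ragged tail.

    Assumption: non-overlapping patches; the paper's actual scheme may differ.
    """
    values = np.asarray(series, dtype=float)
    n_patches = len(values) // patch_len
    return values[: n_patches * patch_len].reshape(n_patches, patch_len)

def autoregressive_pairs(patches):
    """Build (input, target) patch pairs for forward and backward prediction.

    Forward: patch i is the input and patch i+1 is the target.
    Backward (assumed): same construction on the reversed patch sequence,
    so each patch is used to predict the one that precedes it in time.
    """
    forward = (patches[:-1], patches[1:])
    reversed_patches = patches[::-1]
    backward = (reversed_patches[:-1], reversed_patches[1:])
    return forward, backward

# Toy series of 12 points split into 3 patches of length 4.
series = np.arange(12)
patches = make_patches(series, patch_len=4)
(fwd_x, fwd_y), (bwd_x, bwd_y) = autoregressive_pairs(patches)
```

In this toy case the forward targets are the second and third patches, while the backward targets are the second and first patches, which is one simple way to give each direction its own next-patch task.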

Machine Learning, Artificial Intelligence

What problem does this paper attempt to address?

The problem this paper attempts to address is the difficulty that existing time series forecasting models have in handling rapidly changing patterns. Specifically:

1. **Data Scarcity**: Traditional supervised learning models must be trained on task-specific datasets, but in practical applications, data for newly emerging patterns may be missing or difficult to collect.
2. **Lack of Generalization**: These models typically perform well in specific domains or tasks but struggle to generalize across different domains or frequencies, so adapting them from one domain to another is costly and time-consuming.
3. **Low Data Efficiency**: When training data is limited, these models are prone to overfitting.

To tackle these issues, the paper introduces TSMamba, a time series foundation model based on the Mamba architecture. The main features of TSMamba include:

- **Linear Complexity**: By using the Mamba architecture, TSMamba scales linearly with input length, avoiding the quadratic complexity of Transformer models.
- **Two-Stage Transfer Learning**: Starting from a large-scale pretrained Mamba language model, a two-stage transfer learning process adapts the model to time series data while reducing dependence on large-scale datasets and lowering training costs.
- **Multivariate Data Handling**: A channel-wise compressed attention module is introduced to capture cross-channel dependencies in multivariate data during fine-tuning on specific datasets.

In summary, this paper aims to develop an efficient, highly generalizable, and data-efficient time series forecasting model that addresses the real-world challenges of changing data patterns and data scarcity.
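The linear-versus-quadratic contrast above can be made concrete with a small sketch. It is not Mamba itself: Mamba uses a selective state space model with input-dependent parameters, whereas this example assumes a scalar state and fixed coefficients `a`, `b`, `c` purely to show why a recurrent scan costs one update per time step. The helper `attention_score_count` is likewise a hypothetical illustration of the number of pairwise scores that full self-attention computes.

```python
import numpy as np

def ssm_scan(x, a, b, c):
    """Run a scalar linear recurrence over the input in a single pass.

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    Simplifying assumption: fixed scalar parameters, unlike Mamba's
    input-dependent (selective) parameters. Cost is one update per step,
    so it grows linearly with len(x).
    """
    h = 0.0
    y = np.empty(len(x))
    for t, x_t in enumerate(x):
        h = a * h + b * x_t
        y[t] = c * h
    return y

def attention_score_count(length):
    """Number of pairwise scores in full self-attention over `length` tokens."""
    return length * length

# Doubling the input doubles the scan steps but quadruples attention scores.
y = ssm_scan([1.0, 1.0, 1.0], a=0.5, b=1.0, c=1.0)
scores_short = attention_score_count(512)
scores_long = attention_score_count(1024)
```

Under these assumptions, going from 512 to 1024 inputs doubles the number of recurrence updates, while the attention score count grows by a factor of four.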

Foreformer: an Enhanced Transformer-Based Framework for Multivariate Time Series Forecasting

Is Mamba Effective for Time Series Forecasting?

TFEformer: Temporal Feature Enhanced Transformer for Multivariate Time Series Forecasting

Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting

UmambaTSF: A U-shaped Multi-Scale Long-Term Time Series Forecasting Method Using Mamba

Bi-Mamba4TS: Bidirectional Mamba for Time Series Forecasting

Integrating Mamba and Transformer for Long-Short Range Time Series Forecasting

FMamba: Mamba based on Fast-attention for Multivariate Time-series Forecasting

TimeMachine: A Time Series is Worth 4 Mambas for Long-term Forecasting

Mamba or Transformer for Time Series Forecasting? Mixture of Universals (MoU) Is All You Need

Integration of Mamba and Transformer -- MAT for Long-Short Range Time Series Forecasting with Application to Weather Dynamics

DTMamba : Dual Twin Mamba for Time Series Forecasting

CMamba: Channel Correlation Enhanced State Space Models for Multivariate Time Series Forecasting

Mamba4Cast: Efficient Zero-Shot Time Series Forecasting with State Space Models

Sequential Order-Robust Mamba for Time Series Forecasting

Simplified Mamba with Disentangled Dependency Encoding for Long-Term Time Series Forecasting

TCLN: A Transformer-based Conv-LSTM Network for Multivariate Time Series Forecasting

Generalizable Memory-driven Transformer for Multivariate Long Sequence Time-series Forecasting

SST: Multi-Scale Hybrid Mamba-Transformer Experts for Long-Short Range Time Series Forecasting

Test Time Learning for Time Series Forecasting