Abstract: Deep learning for time series forecasting has traditionally operated within a one-model-per-dataset framework, limiting its potential to leverage the game-changing impact of large pre-trained models. The concept of universal forecasting, emerging from pre-training on a vast collection of time series datasets, envisions a single Large Time Series Model capable of addressing diverse downstream forecasting tasks. However, constructing such a model poses unique challenges specific to time series data: i) cross-frequency learning, ii) accommodating an arbitrary number of variates for multivariate time series, and iii) addressing the varying distributional properties inherent in large-scale data. To address these challenges, we present novel enhancements to the conventional time series Transformer architecture, resulting in our proposed Masked Encoder-based Universal Time Series Forecasting Transformer (Moirai). Trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains, Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models. Code, data, and model weights can be found at

What problem does this paper attempt to address?

The paper addresses a central challenge in time series forecasting: how to build a universal forecasting model that predicts well across diverse datasets without dataset-specific training. Its main contributions are:

1. **A novel Transformer architecture**: Traditional deep forecasting trains one model per dataset. To move beyond this, the authors design the Masked Encoder-based Universal Time Series Forecasting Transformer (Moirai), which handles multivariate time series with different frequencies, an arbitrary number of variates, and varying distributional properties.
   - **Cross-frequency learning**: Multiple input and output projection layers are learned to handle time series of different frequencies.
   - **Any-variate forecasting**: An Any-variate Attention mechanism flexibly handles any number of variates.
   - **Flexible predictive distribution**: A mixture distribution accommodates data with different distributional characteristics.
2. **The large-scale open time series dataset LOTSA**: To train this universal model, the authors built the Large-scale Open Time Series Archive (LOTSA), which contains over 27 billion observations spanning nine domains.
3. **Model evaluation**: Moirai was tested under in-distribution and zero-shot forecasting settings. As a zero-shot forecaster, it achieves competitive or superior performance compared with full-shot models on multiple benchmarks.

In summary, this research advances time series forecasting by developing a single universal model that performs well across a variety of tasks, addressing the limited generalization of the traditional one-model-per-dataset approach.
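The three architectural ideas listed above can be made more concrete with a small sketch. This is not the authors' implementation: it is a minimal pure-Python illustration under stated assumptions. The frequency-to-patch-size table, the fixed bias scalars (a trained model would learn them), the use of Gaussian mixture components, and every function name below are hypothetical choices made for illustration only.

```python
import math
import random

# ASSUMPTION: illustrative frequency -> patch size table, not the paper's values.
PATCH_SIZE_BY_FREQ = {"hourly": 4, "daily": 2}
D_MODEL = 3  # hypothetical token width


def make_projection(patch_size, d_model, rng):
    """One input projection (patch_size x d_model weights) per patch size."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(d_model)]
            for _ in range(patch_size)]


def patchify(series, patch_size):
    """Split a 1-D series into non-overlapping patches, dropping any remainder."""
    n = len(series) // patch_size
    return [series[i * patch_size:(i + 1) * patch_size] for i in range(n)]


def project(patch, weight):
    """Map one patch to a d_model-dimensional token (vector-matrix product)."""
    return [sum(x * weight[i][j] for i, x in enumerate(patch))
            for j in range(len(weight[0]))]


def flatten_multivariate(variates, freq, projections):
    """Flatten any number of variates into one token sequence with id tags."""
    patch_size = PATCH_SIZE_BY_FREQ[freq]
    weight = projections[patch_size]  # projection chosen by frequency
    tokens, variate_ids, time_ids = [], [], []
    for v, series in enumerate(variates):
        for t, patch in enumerate(patchify(series, patch_size)):
            tokens.append(project(patch, weight))
            variate_ids.append(v)
            time_ids.append(t)
    return tokens, variate_ids, time_ids


def attention_bias(variate_ids, same_bias, diff_bias):
    """Bias matrix: one scalar for same-variate token pairs, another otherwise."""
    return [[same_bias if a == b else diff_bias for b in variate_ids]
            for a in variate_ids]


def gaussian_logpdf(x, mean, std):
    """Log-density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi) - math.log(std) - 0.5 * ((x - mean) / std) ** 2


def mixture_nll(x, weights, components):
    """Negative log-likelihood of x under a mixture (Gaussians as a stand-in)."""
    log_terms = [math.log(w) + gaussian_logpdf(x, mean, std)
                 for w, (mean, std) in zip(weights, components)]
    m = max(log_terms)  # log-sum-exp for numerical stability
    return -(m + math.log(sum(math.exp(t - m) for t in log_terms)))


rng = random.Random(0)
projections = {p: make_projection(p, D_MODEL, rng)
               for p in sorted(set(PATCH_SIZE_BY_FREQ.values()))}

# Three hourly variates of length 8: patch size 4 gives 2 patches per variate.
hourly = [[float(i + 10 * v) for i in range(8)] for v in range(3)]
tokens, variate_ids, time_ids = flatten_multivariate(hourly, "hourly", projections)
bias = attention_bias(variate_ids, same_bias=1.0, diff_bias=0.0)

# Two daily variates of length 8: patch size 2 gives 4 patches per variate.
daily = [[float(i) for i in range(8)] for _ in range(2)]
daily_tokens, daily_variate_ids, _ = flatten_multivariate(daily, "daily", projections)

nll = mixture_nll(0.3, [0.7, 0.3], [(0.0, 1.0), (2.0, 0.5)])
```

The same `flatten_multivariate` call handles inputs with two or three variates and with either frequency, which is the point of the design: the number of variates only changes the length of the flattened sequence, and the frequency only selects which projection is used.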

Unified Training of Universal Time Series Forecasting Transformers

TFEformer: Temporal Feature Enhanced Transformer for Multivariate Time Series Forecasting

Foreformer: an Enhanced Transformer-Based Framework for Multivariate Time Series Forecasting

Hidformer: Hierarchical Dual-Tower Transformer Using Multi-Scale Mergence for Long-Term Time Series Forecasting

Generalizable Memory-driven Transformer for Multivariate Long Sequence Time-series Forecasting

Mamba or Transformer for Time Series Forecasting? Mixture of Universals (MoU) Is All You Need

Enhancing Time Series Forecasting: A Hierarchical Transformer with Probabilistic Decomposition Representation

Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts

UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting

Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts

Multi-scale Transformer Pyramid Networks for Multivariate Time Series Forecasting

Multi-resolution Time-Series Transformer for Long-term Forecasting

UniTST: Effectively Modeling Inter-Series and Intra-Series Dependencies for Multivariate Time Series Forecasting

TCLN: A Transformer-based Conv-LSTM Network for Multivariate Time Series Forecasting

Two Steps Forward and One Behind: Rethinking Time Series Forecasting with Deep Learning

UNITS: A Unified Multi-Task Time Series Model

Dateformer: Time-modeling Transformer for Longer-term Series Forecasting

Multi-scale convolution enhanced transformer for multivariate long-term time series forecasting

DAM: Towards A Foundation Model for Time Series Forecasting

A Time Series is Worth 64 Words: Long-term Forecasting with Transformers

Scalable Transformer for High Dimensional Multivariate Time Series Forecasting