Unified Training of Universal Time Series Forecasting Transformers

Gerald Woo, Chenghao Liu, Akshat Kumar, Caiming Xiong, Silvio Savarese, Doyen Sahoo
2024-05-22
Abstract: Deep learning for time series forecasting has traditionally operated within a one-model-per-dataset framework, limiting its potential to leverage the game-changing impact of large pre-trained models. The concept of universal forecasting, emerging from pre-training on a vast collection of time series datasets, envisions a single Large Time Series Model capable of addressing diverse downstream forecasting tasks. However, constructing such a model poses unique challenges specific to time series data: i) cross-frequency learning, ii) accommodating an arbitrary number of variates for multivariate time series, and iii) addressing the varying distributional properties inherent in large-scale data. To address these challenges, we present novel enhancements to the conventional time series Transformer architecture, resulting in our proposed Masked Encoder-based Universal Time Series Forecasting Transformer (Moirai). Trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains, Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models. Code, data, and model weights can be found at
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses a central challenge in time series forecasting: how to build a single universal model that forecasts well across a variety of datasets without dataset-specific training. Its main contributions are:

1. **A novel Transformer architecture**: Traditional deep learning methods train one model per dataset. To move past this, the authors design the Masked Encoder-based Universal Time Series Forecasting Transformer (Moirai), which handles multivariate time series with different frequencies, an arbitrary number of variates, and varying distributional properties.
   - **Cross-frequency learning**: Multiple input and output projection layers are learned to handle time series of different frequencies.
   - **Any-variate forecasting**: An Any-variate Attention mechanism flexibly handles any number of variates.
   - **Flexible predictive distribution**: A mixture distribution accommodates data with differing distributional characteristics.
2. **The large-scale open time series dataset LOTSA**: To train this universal model, the authors constructed the Large-scale Open Time Series Archive (LOTSA), which contains over 27 billion observations spanning nine domains.
3. **Model evaluation**: Moirai was evaluated in both in-distribution and zero-shot forecasting settings. As a zero-shot forecaster, it achieved competitive or better performance than full-shot models on multiple benchmarks, which supports its effectiveness without per-dataset training.
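The cross-frequency and any-variate ideas listed under the first contribution can be illustrated with a minimal sketch. It rests on assumptions that this summary does not state: that each frequency maps to a patch size (which would in turn select one of the learned projection layers), and that a multivariate series is flattened into one token sequence in which every token carries a variate index and a time index. The names `PATCH_SIZE_BY_FREQ`, `patchify`, and `flatten_variates`, as well as the patch-size values, are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical frequency -> patch size table (illustrative values only).
# In a real model, each patch size would select its own learned
# input/output projection layer.
PATCH_SIZE_BY_FREQ: Dict[str, int] = {"H": 4, "D": 2}

# A token is (variate_id, time_id, patch_values).
Token = Tuple[int, int, List[float]]


def patchify(series: List[float], patch_size: int) -> List[List[float]]:
    """Split one variate into non-overlapping patches.

    Simplification: a ragged tail shorter than patch_size is dropped.
    """
    n_full = len(series) // patch_size
    return [series[i * patch_size:(i + 1) * patch_size] for i in range(n_full)]


def flatten_variates(variates: List[List[float]], freq: str) -> List[Token]:
    """Flatten any number of variates into a single token sequence.

    Each token keeps its variate index and time index, so an attention
    layer could tell variates apart regardless of how many there are.
    """
    patch_size = PATCH_SIZE_BY_FREQ[freq]
    tokens: List[Token] = []
    for variate_id, series in enumerate(variates):
        for time_id, patch in enumerate(patchify(series, patch_size)):
            tokens.append((variate_id, time_id, patch))
    return tokens


# Example: three daily variates of length 4 become six tokens of length 2.
example = flatten_variates(
    [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 10.0, 11.0, 12.0]], "D"
)
print(len(example))
```

Because the token sequence length depends only on the number of variates and patches, the same function accepts one variate or many, which is the property the any-variate design is meant to provide.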
In summary, this research aims to advance time series forecasting by developing a universal model that performs well across diverse forecasting tasks, addressing the limited generalization of the traditional one-model-per-dataset approach.