Large Language Models Are Zero-Shot Time Series Forecasters

Nate Gruver, Marc Finzi, Shikai Qiu, Andrew Gordon Wilson
2024-08-12
Abstract: By encoding time series as a string of numerical digits, we can frame time series forecasting as next-token prediction in text. Developing this approach, we find that large language models (LLMs) such as GPT-3 and LLaMA-2 can surprisingly zero-shot extrapolate time series at a level comparable to or exceeding the performance of purpose-built time series models trained on the downstream tasks. To facilitate this performance, we propose procedures for effectively tokenizing time series data and converting discrete distributions over tokens into highly flexible densities over continuous values. We argue the success of LLMs for time series stems from their ability to naturally represent multimodal distributions, in conjunction with biases for simplicity, and repetition, which align with the salient features in many time series, such as repeated seasonal trends. We also show how LLMs can naturally handle missing data without imputation through non-numerical text, accommodate textual side information, and answer questions to help explain predictions. While we find that increasing model size generally improves performance on time series, we show GPT-4 can perform worse than GPT-3 because of how it tokenizes numbers, and poor uncertainty calibration, which is likely the result of alignment interventions such as RLHF.
Machine Learning
What problem does this paper attempt to address?
The paper asks how large language models (LLMs) can be used for time series forecasting in a zero-shot setting, i.e., without any fine-tuning on the target dataset. The authors encode a time series as a string of numerical digits, which turns forecasting into next-token prediction in text. This lets them use a pretrained LLM's probabilistic capabilities directly, such as likelihood evaluation and sampling. According to the abstract, the resulting forecasts are comparable to or better than those of purpose-built time series models trained on the downstream tasks, and the approach also handles missing data without imputation, accommodates textual side information, and can answer questions that help explain its predictions.

The key contributions of the paper include:

1. **Proposing a simple and effective method**: By encoding time series data as numerical strings and treating forecasting as next-token prediction in text, the authors show how to use large language models for time series forecasting.
2. **Addressing the unique challenges of time series data**: Time series data often comes from different sources, may contain missing values, and requires extrapolation from observed data. Accurate point predictions are therefore very difficult, and uncertainty estimation becomes particularly important. The authors address these challenges with carefully designed tokenization strategies and data rescaling methods.
3. **Advantages of zero-shot prediction**: The method requires no fine-tuning on the target dataset, so it avoids the specialized knowledge and extensive computational resources that training a dedicated model demands, and it remains applicable when little data is available.
4. **Exploring the intrinsic capabilities of large language models**: The authors investigate how large language models express preferences for simple or repetitive sequences and find that these preferences align with prominent structures in time series, such as seasonality. Large language models can also naturally handle missing data and represent multimodal distributions, both of which are particularly useful for time series forecasting.

Overall, the paper demonstrates the potential of large language models for time series forecasting by recasting the forecasting problem as a text prediction problem.
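The encoding idea summarized above can be illustrated with a minimal sketch. It assumes a fixed-precision format in which each value is rescaled, its decimal point is dropped, digits are separated by spaces, time steps are separated by commas, and missing values are written as the text `NaN`. The function names (`encode_series`, `decode_series`), the max-absolute-value rescaling, and the default precision are illustrative assumptions, not details taken from the summary or a statement of the authors' exact procedure.

```python
import math


def encode_series(values, precision=2, scale=None):
    """Turn a list of floats into a digit string for an LLM prompt.

    Assumed format (hypothetical, for illustration only): values are divided
    by `scale`, rounded to `precision` decimal places, written without the
    decimal point, one space between digits, ' , ' between time steps.
    Missing values (None or NaN) are written as the text 'NaN'.
    """
    finite = [abs(v) for v in values if v is not None and not math.isnan(v)]
    if scale is None:
        # Assumption: rescale by the largest absolute observed value.
        scale = max(finite) if finite and max(finite) > 0 else 1.0
    chunks = []
    for v in values:
        if v is None or math.isnan(v):
            chunks.append("NaN")
            continue
        n = round(abs(v) / scale * 10**precision)
        digits = " ".join(str(n).zfill(precision + 1))
        chunks.append(("- " if v < 0 else "") + digits)
    return " , ".join(chunks), scale


def decode_series(text, scale, precision=2):
    """Invert encode_series: parse the digit string back into floats."""
    out = []
    for chunk in text.split(","):
        chunk = chunk.strip()
        if chunk == "NaN":
            out.append(float("nan"))
            continue
        sign = -1.0 if chunk.startswith("-") else 1.0
        digits = chunk.lstrip("- ").replace(" ", "")
        out.append(sign * int(digits) / 10**precision * scale)
    return out


if __name__ == "__main__":
    prompt, scale = encode_series([0.5, 1.0, None, 2.0])
    print(prompt)
    print(decode_series(prompt, scale))
```

In a forecasting loop, the encoded string would serve as the prompt, and sampled continuations would be passed through `decode_series` to obtain numeric forecasts. Drawing several samples gives an empirical predictive distribution rather than a single point estimate, which matches the emphasis on uncertainty in the summary.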