Language Models Still Struggle to Zero-shot Reason about Time Series

Mike A. Merrill, Mingtian Tan, Vinayak Gupta, Tom Hartvigsen, Tim Althoff
2024-04-18
Abstract: Time series are critical for decision-making in fields like finance and healthcare. Their importance has driven a recent influx of works passing time series into language models, leading to non-trivial forecasting on some datasets. But it remains unknown whether non-trivial forecasting implies that language models can reason about time series. To address this gap, we generate a first-of-its-kind evaluation framework for time series reasoning, including formal tasks and a corresponding dataset of multi-scale time series paired with text captions across ten domains. Using these data, we probe whether language models achieve three forms of reasoning: (1) Etiological Reasoning - given an input time series, can the language model identify the scenario that most likely created it? (2) Question Answering - can a language model answer factual questions about time series? (3) Context-Aided Forecasting - does highly relevant textual context improve a language model's time series forecasts?
Computation and Language
What problem does this paper attempt to address?
This paper examines whether language models can reason about time series rather than merely forecast them. Recent work has shown that language models can be applied to time series tasks, especially forecasting, but it remains unknown whether non-trivial forecasting implies genuine reasoning. The paper proposes an evaluation framework with three time series reasoning tasks: etiological reasoning (identifying the scenario that most likely produced a given series), question answering, and context-aided forecasting. Across these tasks, the study finds that current language models reason poorly about time series: their scores on etiological reasoning and question answering are only slightly above random and well below human performance. Highly relevant textual context improves forecasts to some extent, but the gain is limited. These findings suggest that time series reasoning is an important and underdeveloped direction in language model research.
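
To make "only slightly above random" concrete, the following is a minimal sketch of how a score on a choice-based task might be compared with a chance baseline. It assumes a multiple-choice format with a fixed number of options per question, which the text above does not specify. The function names, the four-option setting, and the toy answers are all hypothetical illustrations, not the paper's protocol or results.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], answers: Sequence[int]) -> float:
    """Fraction of questions where the chosen option index matches the answer."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("need at least one question")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def chance_accuracy(num_options: int) -> float:
    """Expected accuracy of uniform random guessing over num_options choices."""
    if num_options < 1:
        raise ValueError("num_options must be at least 1")
    return 1.0 / num_options


def margin_over_chance(
    predictions: Sequence[int], answers: Sequence[int], num_options: int
) -> float:
    """Accuracy minus the random-guess baseline (0 means no better than chance)."""
    return accuracy(predictions, answers) - chance_accuracy(num_options)


# Toy, made-up data: 8 questions, 4 options each (assumed, not from the paper).
toy_answers = [0, 2, 1, 3, 0, 1, 2, 3]
toy_predictions = [0, 1, 1, 0, 2, 1, 3, 3]

if __name__ == "__main__":
    print("accuracy:", accuracy(toy_predictions, toy_answers))
    print("chance:", chance_accuracy(4))
    print("margin:", margin_over_chance(toy_predictions, toy_answers, 4))
```

A margin near zero would correspond to the "slightly above random" regime described above. A real evaluation would also report uncertainty, for example a confidence interval over questions, before drawing conclusions.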