Abstract: In recent decades, there have been substantial advances in time series models and benchmarks across individual tasks such as forecasting, classification, and anomaly detection. Meanwhile, compositional reasoning over time series is prevalent in real-world applications (e.g., decision-making and compositional question answering) and is in great demand. Unlike simple tasks that primarily focus on predictive accuracy, compositional reasoning emphasizes the synthesis of diverse information from both time series data and domain knowledge, making it distinct and considerably more challenging. In this paper, we introduce Compositional Time Series Reasoning, a new task that requires intricate multistep reasoning over time series data. Specifically, this task focuses on question instances that require structural and compositional reasoning abilities, such as decision-making and compositional question answering. As an initial attempt to tackle this task, we develop TS-Reasoner, a program-aided approach that uses a large language model (LLM) to decompose a complex task into a program whose steps leverage existing time series models and numerical subroutines. Unlike existing reasoning work, which only calls off-the-shelf modules, TS-Reasoner allows for the creation of custom modules and provides greater flexibility to incorporate domain knowledge as well as user-specified constraints. We demonstrate the effectiveness of our method through a comprehensive set of experiments. These promising results indicate potential opportunities in the new task of time series reasoning and highlight the need for further research.
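To make the program-aided idea concrete, the following is a minimal sketch and not the TS-Reasoner implementation. It assumes a hypothetical module registry (`MODULES`), a hand-written `PROGRAM` standing in for what an LLM might emit, a seasonal-naive forecaster in place of an existing time series model, and a custom `clip_to_limit` module that encodes a user-specified upper bound. All names, signatures, and the step format are illustrative assumptions.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry of callable modules; names are illustrative only.
MODULES: Dict[str, Callable[..., Any]] = {}


def register(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that adds a function to the module registry under `name`."""
    def wrap(fn: Callable[..., Any]) -> Callable[..., Any]:
        MODULES[name] = fn
        return fn
    return wrap


@register("forecast")
def seasonal_naive(series: List[float], horizon: int, period: int) -> List[float]:
    """Stand-in for an existing forecasting model: repeat the last full period."""
    last = series[-period:]
    return [last[i % period] for i in range(horizon)]


@register("clip_to_limit")
def clip_to_limit(values: List[float], limit: float) -> List[float]:
    """Custom module encoding a user-specified constraint (an upper bound)."""
    return [min(v, limit) for v in values]


@register("mean")
def mean(values: List[float]) -> float:
    """Simple numerical subroutine."""
    return sum(values) / len(values)


# A program as an LLM might emit it (assumed format): each step names a
# module, its arguments, and the variable that receives the output.
# Arguments starting with "$" refer to earlier variables.
PROGRAM = [
    {"out": "pred", "module": "forecast",
     "args": {"series": "$history", "horizon": 4, "period": 2}},
    {"out": "safe", "module": "clip_to_limit",
     "args": {"values": "$pred", "limit": 10.0}},
    {"out": "answer", "module": "mean", "args": {"values": "$safe"}},
]


def run_program(program: List[Dict[str, Any]], env: Dict[str, Any]) -> Dict[str, Any]:
    """Execute the steps in order, resolving "$name" references from env."""
    env = dict(env)  # do not mutate the caller's inputs
    for step in program:
        kwargs = {
            key: env[val[1:]] if isinstance(val, str) and val.startswith("$") else val
            for key, val in step["args"].items()
        }
        env[step["out"]] = MODULES[step["module"]](**kwargs)
    return env


result = run_program(PROGRAM, {"history": [8.0, 12.0, 9.0, 11.0]})
print(result["answer"])
```

In this sketch the decomposition is fixed by hand; in a program-aided system the step list would be generated by the LLM from the question, and the registry is where custom modules and domain constraints could be plugged in.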

Language Models Still Struggle to Zero-shot Reason about Time Series

Implicit Reasoning in Deep Time Series Forecasting

A Picture is Worth A Thousand Numbers: Enabling LLMs Reason about Time Series via Visualization

Towards Time Series Reasoning with LLMs

Are Language Models Actually Useful for Time Series Forecasting?

Large Language Models Are Zero-Shot Time Series Forecasters

Large language models can be zero-shot anomaly detectors for time series?

Evaluating Large Language Models on Time Series Feature Understanding: A Comprehensive Taxonomy and Benchmark

TimeSeriesExam: A time series understanding exam

Position: What Can Large Language Models Tell Us about Time Series Analysis

Towards Benchmarking and Improving the Temporal Reasoning Capability of Large Language Models

Timo: Towards Better Temporal Reasoning for Language Models

TableTime: Reformulating Time Series Classification as Zero-Shot Table Understanding via Large Language Models

Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks

ChatTS: Aligning Time Series with LLMs via Synthetic Data for Enhanced Understanding and Reasoning

Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities

A Survey of Time Series Foundation Models: Generalizing Time Series Representation with Large Language Model

Can LLMs Understand Time Series Anomalies?

Reasoning and Tools for Human-Level Forecasting

Beyond Forecasting: Compositional Time Series Reasoning for End-to-End Task Execution