Abstract: Understanding time series is crucial for its application in real-world scenarios. Recently, large language models (LLMs) have been increasingly applied to time series tasks, leveraging their strong language capabilities to enhance various applications. However, research on multimodal LLMs (MLLMs) for time series understanding and reasoning remains limited, primarily due to the scarcity of high-quality datasets that align time series with textual information. This paper introduces ChatTS, a novel MLLM designed for time series analysis. ChatTS treats time series as a modality, similar to how vision MLLMs process images, enabling it to perform both understanding and reasoning with time series. To address the scarcity of training data, we propose an attribute-based method for generating synthetic time series with detailed attribute descriptions. We further introduce Time Series Evol-Instruct, a novel approach that generates diverse time series Q&As, enhancing the model's reasoning capabilities. To the best of our knowledge, ChatTS is the first MLLM that takes multivariate time series as input, which is fine-tuned exclusively on synthetic datasets. We evaluate its performance using benchmark datasets with real-world data, including six alignment tasks and four reasoning tasks. Our results show that ChatTS significantly outperforms existing vision-based MLLMs (e.g., GPT-4o) and text/agent-based LLMs, achieving a 46.0% improvement in alignment tasks and a 25.8% improvement in reasoning tasks.

What problem does this paper attempt to address?

The paper aims to improve how large language models (LLMs) understand and reason about time series. Although existing multimodal large language models (MLLMs) have made significant progress on vision-language tasks, they remain limited in time series analysis. These limitations fall into four areas:

1. **Data scarcity**: High-quality datasets that align time series with text are very scarce, which restricts what the model can learn.
2. **Time series property description**: Time series data contain rich shape and numerical properties, and effective multimodal alignment requires precise and diverse text descriptions of them.
3. **Multivariate time series processing**: Real-world time series are usually multivariate and of variable length, and existing models have difficulty handling this complexity.
4. **Lack of evaluation data**: Without comprehensive evaluation data and methods, it is difficult to accurately assess model performance on time series tasks.

To address these problems, the paper proposes the following innovations:

- **Attribute-based synthetic time series generation**: Describe the properties of each time series in detail and use those descriptions to generate high-quality synthetic time series paired with text, overcoming data scarcity.
- **Time Series Evol-Instruct (TSEvol)**: Dynamically introduce time series properties to generate diverse question-and-answer datasets that strengthen the model's reasoning ability.
- **Context-aware time series multimodal LLM (ChatTS)**: Design a model that handles multivariate time series inputs, uses a context-aware time series encoder, retains the original numerical information, and improves performance through multi-stage training.

Through these methods, the paper aims to enable LLMs to better understand and reason about time series data, and so to provide more accurate analysis and interpretation in practical applications.
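The idea behind attribute-based generation can be illustrated with a minimal sketch: sample a set of attributes, build the series from them, and render the same attributes as text so that the series and its description are aligned by construction. This is not the paper's generator. The `Attributes` fields, the `generate` and `describe` functions, and the additive trend + seasonality + noise + spike composition are all assumptions made for illustration.

```python
import math
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Attributes:
    """Hypothetical attribute set, chosen for illustration only."""
    trend_slope: float   # change per time step
    period: int          # seasonal period in steps (0 = no seasonality)
    amplitude: float     # seasonal amplitude
    noise_std: float     # standard deviation of Gaussian noise
    spike_at: int        # index of a local upward spike (-1 = none)
    spike_height: float  # height added at the spike index


def generate(attrs: Attributes, length: int, seed: int = 0) -> List[float]:
    """Compose a series as trend + seasonality + noise + one local spike."""
    rng = random.Random(seed)  # seeded so the same attributes reproduce the same series
    series = []
    for t in range(length):
        value = attrs.trend_slope * t
        if attrs.period > 0:
            value += attrs.amplitude * math.sin(2 * math.pi * t / attrs.period)
        value += rng.gauss(0.0, attrs.noise_std)
        if t == attrs.spike_at:
            value += attrs.spike_height
        series.append(value)
    return series


def describe(attrs: Attributes, length: int) -> str:
    """Render the attributes as text; the wording is an assumed template."""
    if attrs.trend_slope > 0:
        trend = "increasing"
    elif attrs.trend_slope < 0:
        trend = "decreasing"
    else:
        trend = "steady"
    parts = [f"The series has {length} points and an overall {trend} trend."]
    if attrs.period > 0:
        parts.append(
            f"It shows seasonality with period {attrs.period} "
            f"and amplitude {attrs.amplitude}."
        )
    else:
        parts.append("It shows no seasonality.")
    if 0 <= attrs.spike_at < length:
        parts.append(
            f"There is an upward spike of about {attrs.spike_height} "
            f"near point {attrs.spike_at}."
        )
    return " ".join(parts)


attrs = Attributes(trend_slope=0.05, period=24, amplitude=2.0,
                   noise_std=0.1, spike_at=60, spike_height=10.0)
series = generate(attrs, length=96)
caption = describe(attrs, length=96)
print(caption)
```

Because the text is derived from the same attributes that produced the values, every statement in the description is true of the series without any manual labeling, which is what makes synthetic data attractive when aligned real-world data is scarce.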

ChatTS: Aligning Time Series with LLMs via Synthetic Data for Enhanced Understanding and Reasoning
