Abstract: Large Language Models (LLMs) offer the potential for automatic time series analysis and reporting, which is a critical task across many domains, spanning healthcare, finance, climate, energy, and many more. In this paper, we propose a framework for rigorously evaluating the capabilities of LLMs on time series understanding, encompassing both univariate and multivariate forms. We introduce a comprehensive taxonomy of time series features, a critical framework that delineates various characteristics inherent in time series data. Leveraging this taxonomy, we have systematically designed and synthesized a diverse dataset of time series, embodying the different outlined features, each accompanied by textual descriptions. This dataset acts as a solid foundation for assessing the proficiency of LLMs in comprehending time series. Our experiments shed light on the strengths and limitations of state-of-the-art LLMs in time series understanding, revealing which features these models readily comprehend effectively and where they falter. In addition, we uncover the sensitivity of LLMs to factors including the formatting of the data, the position of points queried within a series and the overall time series length.

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper aims to fill a gap in current research by systematically evaluating the capabilities of general large language models (LLMs) in understanding time series. Although recent studies have explored the application of LLMs to specific time series tasks, such as epilepsy localization in electroencephalograms (EEG), cardiovascular disease diagnosis in electrocardiograms (ECG), understanding weather and climate data, and interpretable financial time series forecasting, there is still a lack of systematic evaluation of the fundamental capabilities of general LLMs in understanding time series.

Specifically, the goals of the paper include:

1. **Proposing a comprehensive taxonomy of time series features**: This taxonomy covers various features of univariate and multivariate time series, providing a structured framework for evaluating the capabilities of LLMs in understanding time series.
2. **Constructing a diverse synthetic time series dataset**: This dataset includes various time series features and is accompanied by qualitative and quantitative textual descriptions, providing a solid foundation for evaluating the performance of LLMs.
3. **Systematically evaluating the performance of LLMs in understanding time series**: Through a series of experiments, the paper reveals the strengths and limitations of LLMs in tasks such as time series feature detection, classification, information retrieval, and arithmetic reasoning, particularly their performance in handling factors such as data format, query location, and time series length.

Through these goals, the paper hopes to provide valuable insights for the development of automated time series annotation and summarization tools based on LLMs, thereby enhancing data analysis and reporting workflows in various fields.
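To make the evaluation setup more concrete, the following is a minimal sketch of how a synthetic series with known features, a textual description, and a serialized prompt might be produced. It is not the paper's generator: the function names (`make_series`, `to_csv`, `to_indexed`, `build_prompt`), the parameter defaults, the two serialization formats, and the prompt wording are all assumptions chosen for illustration.

```python
import math
import random


def make_series(n=48, slope=0.5, period=12, amplitude=3.0, noise=0.2, seed=0):
    """Synthesize a univariate series: linear trend + sinusoidal seasonality + noise.

    Hypothetical helper; the features and defaults are illustrative only.
    """
    rng = random.Random(seed)  # seeded so the series is reproducible
    values = [
        slope * t + amplitude * math.sin(2 * math.pi * t / period) + rng.gauss(0, noise)
        for t in range(n)
    ]
    if slope > 0:
        direction, trend_text = "upward", "an upward linear trend"
    elif slope < 0:
        direction, trend_text = "downward", "a downward linear trend"
    else:
        direction, trend_text = "none", "no linear trend"
    # Ground-truth labels and a textual description accompany each series.
    labels = {"trend": direction, "seasonal_period": period}
    description = (
        f"A series of {n} points with {trend_text} "
        f"and seasonality of period {period}."
    )
    return values, labels, description


def to_csv(values, decimals=2):
    """Serialize as a single comma-separated line (one assumed format)."""
    return ", ".join(f"{v:.{decimals}f}" for v in values)


def to_indexed(values, decimals=2):
    """Serialize as one 'index: value' pair per line (another assumed format)."""
    return "\n".join(f"{i}: {v:.{decimals}f}" for i, v in enumerate(values))


def build_prompt(serialized, question):
    """Wrap a serialized series and a question into a plain-text prompt."""
    return f"Time series:\n{serialized}\n\nQuestion: {question}\nAnswer:"


values, labels, description = make_series()
prompt = build_prompt(
    to_csv(values),
    "Does this series have an upward trend, a downward trend, or no trend?",
)
```

Because the labels are known by construction, a model's answer to `prompt` can be scored against `labels["trend"]`. Swapping `to_csv` for `to_indexed`, or changing `n`, gives a simple way to probe the sensitivity to data format and series length that the summary mentions.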

Evaluating Large Language Models on Time Series Feature Understanding: A Comprehensive Taxonomy and Benchmark

Large Language Models for Time Series: A Survey

Empowering Time Series Analysis with Large Language Models: A Survey

Revisited Large Language Model for Time Series Analysis through Modality Alignment

Position: What Can Large Language Models Tell Us about Time Series Analysis

An Evaluation of Standard Statistical Models and LLMs on Time Series Forecasting

A Survey on Evaluation of Large Language Models

An Interdisciplinary Outlook on Large Language Models for Scientific Research

Large Language Models(LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey

Large Language Models in Healthcare: A Comprehensive Benchmark

Survey of different Large Language Model Architectures: Trends, Benchmarks, and Challenges

LLMEval: A Preliminary Study on How to Evaluate Large Language Models

Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities

A Review of Current Trends, Techniques, and Challenges in Large Language Models (LLMs)

Large Language Models for Forecasting and Anomaly Detection: A Systematic Literature Review

Evaluating Large Language Models: A Comprehensive Survey

An analysis of large language models: their impact and potential applications

A Comprehensive Evaluation of Large Language Models on Temporal Event Forecasting

TimeSeriesExam: A time series understanding exam