A Survey of Time Series Foundation Models: Generalizing Time Series Representation with Large Language Model

Jiexia Ye, Weiqi Zhang, Ke Yi, Yongzi Yu, Ziyue Li, Jia Li, Fugee Tsung
2024-05-07
Abstract: Time series data are ubiquitous across various domains, making time series analysis critically important. Traditional time series models are task-specific, featuring singular functionality and limited generalization capacity. Recently, large language foundation models have unveiled their remarkable capabilities for cross-task transferability, zero-shot/few-shot learning, and decision-making explainability. This success has sparked interest in the exploration of foundation models to solve multiple time series challenges simultaneously. There are two main research lines, namely pre-training foundation models from scratch for time series and adapting large language foundation models for time series. They both contribute to the development of a unified model that is highly generalizable, versatile, and comprehensible for time series analysis. This survey offers a 3E analytical framework for comprehensive examination of related research. Specifically, we examine existing works from three dimensions, namely Effectiveness, Efficiency, and Explainability. In each dimension, we focus on discussing how related works devise tailored solutions by considering unique challenges in the realm of time series. Furthermore, we provide a domain taxonomy to help followers keep up with the domain-specific advancements. In addition, we introduce extensive resources to facilitate the field's development, including datasets, open-source, time series libraries. A GitHub repository is also maintained for resource updates (
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper attempts to address several key issues in time series analysis:

1. **Knowledge Transferability**:
   - Time series data differ significantly across domains, and even within the same domain over time, making it difficult to transfer representations learned on one specific task to other tasks. For example, stock market models are influenced by highly volatile factors such as economic indicators and investor sentiment, while climate models focus on long-term patterns and seasonal cycles governed by physical laws rather than human behavior. Transferring knowledge across domains is therefore a challenge.
2. **Data Sparseness**:
   - In many traditional time series scenarios, data are collected on a daily, monthly, or yearly basis, resulting in sparse datasets. Data acquisition and annotation may also be subject to privacy constraints: ECG classification, for example, requires clinical diagnoses, which are costly to obtain and restricted by patient privacy protections. This data scarcity hinders the effective training of deep learning models.
3. **Multimodal Learning**:
   - In multimodal time series analysis, leveraging complementary information from different modalities can enhance both the interpretability and the performance of models. For example, in stock trend prediction, news and social media comments can directly influence trading activity, and integrating this information into the model can improve prediction accuracy. However, it is challenging to align multimodal data collected at different frequencies or time intervals so that the temporal relationships between modalities are accurately reflected. In addition, different modalities may require different techniques to capture their information effectively, and integrating them into one coherent model is complex.
4. **Explainability**:
   - Detailed explanations of how a model generates predictions or identifies patterns can significantly enhance the practicality and acceptance of time series analysis. For instance, if a utility company uses an energy demand forecasting model to plan power production and pricing, it needs to demonstrate to regulators and consumers that these decisions are based on reasonable and understandable factors. However, most existing time series models are black boxes that offer no explanation of their behavior or predictions.

To address these challenges, the paper explores two main research directions:

- **Pre-training Foundation Models from Scratch for Time Series**: This approach pre-trains a foundation model on large-scale time series data, enabling it to generalize to various time series tasks.
- **Adapting Large Language Foundation Models for Time Series**: This approach leverages the cross-task generalization ability, zero-shot/few-shot learning capability, and reasoning ability of large language models (LLMs) to address knowledge transfer, data scarcity, and explainability issues in time series analysis.

Through these two research directions, the paper surveys progress toward a unified model with strong generalization ability, versatility, and explainability that can address multiple time series challenges.
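
The alignment difficulty noted under Multimodal Learning (modalities sampled at different frequencies) can be made concrete with a small sketch. The example below is an illustration only, not a method from the paper: it assumes hypothetical daily closing prices and irregularly timed news headlines, and attaches to each price the most recent headline published at or before it, within a staleness limit. The function name `align_asof`, the `max_age` parameter, and all data values are assumptions made for this example.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def align_asof(prices, news, max_age):
    """Attach to each price the latest news item published at or before it.

    Items older than max_age are treated as stale and replaced by None.
    Only past news is used, so no future information leaks into a row.
    """
    news = sorted(news, key=lambda item: item[0])
    news_times = [t for t, _ in news]
    aligned = []
    for ts, price in sorted(prices, key=lambda item: item[0]):
        idx = bisect_right(news_times, ts) - 1  # last news at or before ts
        headline = None
        if idx >= 0 and ts - news_times[idx] <= max_age:
            headline = news[idx][1]
        aligned.append((ts, price, headline))
    return aligned


# Hypothetical toy data: daily closing prices and irregular news headlines.
prices = [
    (datetime(2024, 5, 1, 16), 101.2),
    (datetime(2024, 5, 2, 16), 99.8),
    (datetime(2024, 5, 3, 16), 100.5),
]
news = [
    (datetime(2024, 5, 1, 9, 30), "earnings beat"),
    (datetime(2024, 5, 1, 14, 0), "analyst upgrade"),
    (datetime(2024, 5, 3, 17, 0), "guidance cut"),
]

aligned = align_asof(prices, news, max_age=timedelta(hours=24))
for row in aligned:
    print(row)
```

Here the second and third prices receive no headline: the latest earlier item is more than 24 hours old, and the "guidance cut" item appears only after the close on 3 May. Choosing the staleness limit and the matching direction is itself a modeling decision, which is part of why multimodal alignment is hard.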