Abstract: Recently, there has been a growing interest in leveraging pre-trained large language models (LLMs) for various time series applications. However, the semantic space of LLMs, established through pre-training, is still underexplored and may help yield more distinctive and informative representations to facilitate time series forecasting. To this end, we propose Semantic Space Informed Prompt learning with LLM ($S^2$IP-LLM) to align the pre-trained semantic space with the time series embedding space and perform time series forecasting based on learned prompts from the joint space. We first design a tokenization module tailored for cross-modality alignment, which explicitly concatenates patches of decomposed time series components to create embeddings that effectively encode the temporal dynamics. Next, we leverage the pre-trained word token embeddings to derive semantic anchors and align selected anchors with time series embeddings by maximizing the cosine similarity in the joint space. This way, $S^2$IP-LLM can retrieve relevant semantic anchors as prompts to provide strong indicators (context) for time series that exhibit different temporal dynamics. With thorough empirical studies on multiple benchmark datasets, we demonstrate that the proposed $S^2$IP-LLM can achieve superior forecasting performance over state-of-the-art baselines. Furthermore, our ablation studies and visualizations verify the necessity of prompt learning informed by semantic space.
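
The retrieval step in the abstract (scoring semantic anchors against a time series embedding by cosine similarity and using the best matches as prompts) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `retrieve_anchor_prompts` and `build_llm_input`, the random stand-in anchors, the mean-pooled query, the choice `k=3`, and prepending the prompts to the patch embeddings are all assumptions made for illustration. It shows only the retrieval, not the training that maximizes the similarity.

```python
import numpy as np

def retrieve_anchor_prompts(ts_embedding, anchors, k=3):
    """Return indices and scores of the k anchors most cosine-similar to ts_embedding.

    ts_embedding: (d,) vector summarizing one time series (hypothetical pooling).
    anchors:      (m, d) matrix of candidate semantic anchors.
    """
    ts_unit = ts_embedding / np.linalg.norm(ts_embedding)
    anchor_units = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    scores = anchor_units @ ts_unit          # cosine similarity per anchor, shape (m,)
    top_idx = np.argsort(-scores)[:k]        # highest similarity first
    return top_idx, scores[top_idx]

def build_llm_input(prompt_anchors, patch_embeddings):
    """Prepend the retrieved anchors to the patch embeddings (assumed layout)."""
    return np.concatenate([prompt_anchors, patch_embeddings], axis=0)

# Random stand-ins: 16 anchors and 5 patch embeddings in an 8-dimensional joint space.
rng = np.random.default_rng(0)
anchors = rng.normal(size=(16, 8))
patches = rng.normal(size=(5, 8))
query = patches.mean(axis=0)                 # assumed pooling of the patch embeddings

idx, scores = retrieve_anchor_prompts(query, anchors, k=3)
llm_input = build_llm_input(anchors[idx], patches)
```

Here `llm_input` has 3 prompt rows followed by the 5 patch rows. In a trained model the anchors would be derived from the LLM's word token embeddings rather than sampled at random.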

What problem does this paper attempt to address?

This paper addresses key issues in time series forecasting, in particular how to leverage pre-trained large language models (LLMs) for effective time series prediction. The researchers propose an approach called "Semantic Space Informed Prompt learning with LLM" ($S^2$IP-LLM) to overcome some challenges in existing methods.

### Main Issues

1. **Exploring the Semantic Space of Pre-trained Models**: While pre-trained large language models have achieved great success in natural language processing tasks and shown potential in complex or structured domains, their semantic space has not been fully explored. Exploring it could help generate more distinctive and informative time series representations.
2. **Diversity and Non-stationary Nature of Time Series Data**: Time series data can come from various domains such as healthcare, finance, and transportation. These data often have diverse formats and non-stationary characteristics, which adds complexity to model training.

### Solutions

- **Designing a Specialized Tokenization Module**: This module decomposes the time series into trend, seasonal, and residual components and creates embeddings by concatenating patches of these components, so that the embeddings encode temporal dynamics more effectively.
- **Utilizing Semantic Anchors**: Semantic anchors are derived from the pre-trained word token embeddings and aligned with the time series embeddings by maximizing cosine similarity in a joint space. The selected anchors are then used as prompts that give the model context for time series with different temporal dynamics.
- **Experimental Validation**: Extensive empirical studies on multiple benchmark datasets demonstrate the superior forecasting performance of the proposed $S^2$IP-LLM over state-of-the-art baselines.

### Summary

The main contribution of the paper is $S^2$IP-LLM, a method that improves time series forecasting by leveraging the semantic space of pre-trained language models. Beyond the gains in forecasting performance, the paper's ablation studies and visualizations support the importance of prompt learning informed by semantic space.
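
The tokenization idea described above (decompose the series, cut each component into patches, concatenate the patches) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the paper's module: it swaps in a simple moving-average additive decomposition, and the helper names `decompose`, `patchify`, and `tokenize` as well as the values of `period`, `patch_len`, and `stride` are hypothetical. A learned projection from the concatenated patches to the embedding dimension would normally follow and is omitted here.

```python
import numpy as np

def decompose(series, period):
    """Additive split into trend (moving average), seasonal (per-phase mean), residual."""
    n = len(series)
    # Edge-pad so the moving average returns a trend of the same length as the input.
    pad_left = period // 2
    pad_right = period - 1 - pad_left
    padded = np.pad(series, (pad_left, pad_right), mode="edge")
    trend = np.convolve(padded, np.ones(period) / period, mode="valid")
    detrended = series - trend
    # Average the detrended values at each position within the seasonal cycle.
    phase_means = np.array([detrended[p::period].mean() for p in range(period)])
    seasonal = phase_means[np.arange(n) % period]
    residual = series - trend - seasonal
    return trend, seasonal, residual

def patchify(x, patch_len, stride):
    """Slice a 1-D array into overlapping patches of shape (num_patches, patch_len)."""
    starts = range(0, len(x) - patch_len + 1, stride)
    return np.stack([x[s:s + patch_len] for s in starts])

def tokenize(series, period, patch_len, stride):
    """Concatenate trend, seasonal, and residual patches along the feature axis."""
    components = decompose(series, period)
    patches = [patchify(c, patch_len, stride) for c in components]
    return np.concatenate(patches, axis=1)   # (num_patches, 3 * patch_len)

# Synthetic series: linear trend plus a cycle of length 24.
t = np.arange(96)
series = 0.05 * t + np.sin(2 * np.pi * t / 24)
tokens = tokenize(series, period=24, patch_len=16, stride=8)
```

Each row of `tokens` holds the trend, seasonal, and residual values of one patch side by side, so a downstream projection sees all three components of the same time window at once.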

$\textbf{S}^2$IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting

Time-LLM: Time Series Forecasting by Reprogramming Large Language Models

LeMoLE: LLM-Enhanced Mixture of Linear Experts for Time Series Forecasting

Reprogramming Foundational Large Language Models(LLMs) for Enterprise Adoption for Spatio-Temporal Forecasting Applications: Unveiling a New Era in Copilot-Guided Cross-Modal Time Series Representation Learning

Hierarchical Multimodal LLMs with Semantic Space Alignment for Enhanced Time Series Classification

Empowering Time Series Analysis with Large Language Models: A Survey

Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities

Towards Time Series Reasoning with LLMs

Rethinking Time Series Forecasting with LLMs via Nearest Neighbor Contrastive Learning

Understanding the Role of Textual Prompts in LLM for Time Series Forecasting: an Adapter View

LLM4TS: Aligning Pre-Trained LLMs as Data-Efficient Time-Series Forecasters

Csi-LLM: A Novel Downlink Channel Prediction Method Aligned with LLM Pre-Training

StockTime: A Time Series Specialized Large Language Model Architecture for Stock Price Prediction

Taming Pre-trained LLMs for Generalised Time Series Forecasting via Cross-modal Knowledge Distillation

AutoTimes: Autoregressive Time Series Forecasters via Large Language Models

Temporal Data Meets LLM -- Explainable Financial Time Series Forecasting

Improve Temporal Awareness of LLMs for Sequential Recommendation

GPT4MTS: Prompt-based Large Language Model for Multimodal Time-series Forecasting

CMS-LSTM: Context Embedding and Multi-Scale Spatiotemporal Expression LSTM for Predictive Learning

D2LLM: Decomposed and Distilled Large Language Models for Semantic Search

TEST: Text Prototype Aligned Embedding to Activate LLM's Ability for Time Series