Fine-Tuning a Time Series Foundation Model with Wasserstein Loss

Andrei Chernov

2024-09-19

Abstract: Inspired by recent advancements in large language models (LLMs) for Natural Language Processing (NLP), there has been a surge in research focused on developing foundational models for time series forecasting. One approach involves training LLM architectures on tokenized time series data using cross-entropy loss. Although this method has demonstrated promising results, cross-entropy loss is primarily designed for classification tasks and does not account for the distance between classes. To address this limitation, we propose using the Wasserstein loss for such architectures. To validate our approach, we fine-tuned a foundational time series model on 22 zero-shot datasets, comparing the performance of cross-entropy loss with that of Wasserstein loss. Our results demonstrate that replacing cross-entropy loss with Wasserstein loss significantly improves point estimation.

Machine Learning, Artificial Intelligence, Computation and Language

What problem does this paper attempt to address?

The paper asks whether time series forecasting models built on large language model (LLM) architectures can be improved by changing the training loss. Cross-entropy loss works well for classification, but it is a poor fit for a regression-like task such as time series forecasting: it ignores the distance between classes, so a prediction that lands close to the true value is penalized the same as one that lands far from it. The authors therefore propose replacing cross-entropy loss with Wasserstein loss, which does account for that distance, in order to improve point estimation accuracy. They fine-tune a time series foundation model on 22 zero-shot datasets and find that Wasserstein loss gives a significant improvement over cross-entropy loss in point estimation. In probabilistic forecasting, however, the method performs slightly worse than cross-entropy loss. Directions for future research include training time series foundation models from scratch with Wasserstein loss and exploring more complex distribution assumptions to improve probabilistic forecasting performance.
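The difference between the two losses can be illustrated with a small Python sketch. It is not the paper's implementation: it assumes a one-hot target, ordered classes with uniform spacing (the hypothetical `bin_width` parameter), and the 1-Wasserstein distance computed as the summed absolute difference between the two cumulative distributions. The function names are chosen here for illustration only.

```python
import numpy as np


def cross_entropy(probs, target_bin):
    """Cross-entropy against a one-hot target: only the target's probability matters."""
    return -np.log(probs[target_bin])


def wasserstein_1(probs, target_bin, bin_width=1.0):
    """1-Wasserstein distance between `probs` and a one-hot target.

    Assumes ordered, equally spaced classes; for 1-D distributions the
    distance is the sum of absolute differences between the two CDFs,
    scaled by the spacing between classes.
    """
    target = np.zeros_like(probs)
    target[target_bin] = 1.0
    cdf_gap = np.cumsum(probs) - np.cumsum(target)
    return bin_width * np.abs(cdf_gap).sum()


target_bin = 2
# Both predictions put 0.5 on the true class; the rest of the mass sits
# one class away ("near") or two classes away ("far").
near = np.array([0.0, 0.5, 0.5, 0.0, 0.0])
far = np.array([0.5, 0.0, 0.5, 0.0, 0.0])

ce_near, ce_far = cross_entropy(near, target_bin), cross_entropy(far, target_bin)
w_near, w_far = wasserstein_1(near, target_bin), wasserstein_1(far, target_bin)

print(f"cross-entropy: near={ce_near:.3f}  far={ce_far:.3f}")
print(f"wasserstein-1: near={w_near:.3f}  far={w_far:.3f}")
```

Under these assumptions, cross-entropy assigns the same loss to both predictions, because it only looks at the probability of the true class, while the Wasserstein distance is twice as large for the prediction whose misplaced mass is twice as far away.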

Effective LSTMs with Seasonal-Trend Decomposition and Adaptive Learning and Niching-Based Backtracking Search Algorithm for Time Series Forecasting

Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization

Beyond Accuracy Optimization: Computer Vision Losses for Large Language Model Fine-Tuning

Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities

In-Context Fine-Tuning for Time-Series Foundation Models

Fine-Tuning Large Language Models for Stock Return Prediction Using Newsflow

Are Language Models Actually Useful for Time Series Forecasting?

An Evaluation of Standard Statistical Models and LLMs on Time Series Forecasting

AutoTimes: Autoregressive Time Series Forecasters via Large Language Models

Time-Series Foundation Model for Value-at-Risk

How Much Can Time-related Features Enhance Time Series Forecasting?

Wasserstein distance loss function for financial time series deep learning

Large Language Models Are Zero-Shot Time Series Forecasters

Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning

A decoder-only foundation model for time-series forecasting

Hierarchical Multimodal LLMs with Semantic Space Alignment for Enhanced Time Series Classification

Revisited Large Language Model for Time Series Analysis through Modality Alignment

Uncertainty quantification in fine-tuned LLMs using LoRA ensembles

LLM4TS: Aligning Pre-Trained LLMs as Data-Efficient Time-Series Forecasters