tsGT: Stochastic Time Series Modeling With Transformer

Łukasz Kuciński, Witold Drzewakowski, Mateusz Olko, Piotr Kozakowski, Łukasz Maziarka, Marta Emilia Nowakowska, Łukasz Kaiser, Piotr Miłoś
2024-04-04
Abstract: Time series methods are of fundamental importance in virtually any field of science that deals with temporally structured data. Recently, there has been a surge of deterministic transformer models with time series-specific architectural biases. In this paper, we go in a different direction by introducing tsGT, a stochastic time series model built on a general-purpose transformer architecture. We focus on using a well-known and theoretically justified rolling window backtesting and evaluation protocol. We show that tsGT outperforms the state-of-the-art models on MAD and RMSE, and surpasses its stochastic peers on QL and CRPS, on four commonly used datasets. We complement these results with a detailed analysis of tsGT's ability to model the data distribution and predict marginal quantile values.
Computer Science
What problem does this paper attempt to address?
The paper addresses key issues in time series forecasting by introducing tsGT, a Transformer-based generative time series model designed to capture the inherent randomness of time series data. The main problems the paper attempts to solve are outlined below.

1. **Challenges in Time Series Modeling**:
   - Time series data is widely used in fields such as medicine, finance, and economics, so accurate forecasts for this type of data are crucial.
   - Most current time series models are deterministic, are typically evaluated within a single time window, and are scored with point estimation metrics such as Mean Absolute Deviation (MAD) or Root Mean Square Error (RMSE).
2. **Proposed Method**:
   - The paper proposes tsGT, a generative decoder model based on the Transformer architecture that focuses on modeling the stochastic characteristics of time series data.
   - tsGT adopts a general-purpose architecture without domain-specific inductive biases, which helps reduce training costs and improve model scalability.
   - To evaluate model performance, the authors employ a rolling window backtesting protocol, a more robust evaluation method that assesses the model's stability over time.
3. **Problems Addressed**:
   - **Handling Randomness**: Many current models do not adequately account for the stochastic nature of time series data, whereas tsGT is a stochastic model that predicts a distribution rather than a single point.
   - **Performance Improvement**: tsGT outperforms existing state-of-the-art models on MAD and RMSE on four commonly used datasets.
   - **Quantile Loss and Continuous Ranked Probability Score**: On the quantile loss (QL) and continuous ranked probability score (CRPS) metrics, tsGT also surpasses other stochastic models.
   - **Data Distribution Modeling Capability**: A detailed analysis examines how well tsGT models the data distribution and predicts marginal quantile values.
   - **Robustness Evaluation**: The predictive ability and stability of tsGT are assessed through rolling window analysis and backtesting.

In summary, the core objective of this paper is to improve the handling of randomness and uncertainty in time series forecasting through the tsGT model, which improves on prior models across multiple evaluation metrics.
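
To make the evaluation protocol concrete, the following is a minimal sketch of rolling window backtesting with the four metrics named above (MAD, RMSE, QL, CRPS). It is not the authors' implementation: the function names, the window parameters, the sample-based CRPS estimator, and the `naive_sampler` forecaster (a stand-in for a stochastic model such as tsGT) are all assumptions made for illustration, and the metrics are written in their plain, unnormalized textbook form, which may differ from the normalization used in the paper.

```python
import numpy as np

def rolling_windows(n_obs, context_len, horizon, stride):
    """Yield (context, target) slices that advance through the series."""
    start = 0
    while start + context_len + horizon <= n_obs:
        split = start + context_len
        yield slice(start, split), slice(split, split + horizon)
        start += stride

def mad(y_true, y_pred):
    """Mean absolute deviation of a point forecast."""
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root mean square error of a point forecast."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def quantile_loss(y_true, q_pred, q):
    """Pinball loss for a predicted quantile at level q in (0, 1)."""
    diff = y_true - q_pred
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))

def crps_from_samples(y_true, samples):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|.

    samples has shape (n_samples, horizon); y_true has shape (horizon,).
    """
    term1 = np.mean(np.abs(samples - y_true), axis=0)
    pairwise = np.abs(samples[:, None, :] - samples[None, :, :])
    term2 = np.mean(pairwise, axis=(0, 1))
    return float(np.mean(term1 - 0.5 * term2))

def naive_sampler(context, horizon, n_samples, rng):
    """Hypothetical stand-in model: last observed value plus Gaussian noise."""
    scale = np.std(np.diff(context)) + 1e-8
    noise = rng.normal(0.0, scale, size=(n_samples, horizon))
    return context[-1] + noise

def backtest(series, sampler, context_len, horizon, stride,
             n_samples=100, q=0.9, seed=0):
    """Average each metric over all rolling windows of the series."""
    rng = np.random.default_rng(seed)
    rows = []
    for ctx, tgt in rolling_windows(len(series), context_len, horizon, stride):
        samples = sampler(series[ctx], horizon, n_samples, rng)
        y = series[tgt]
        rows.append((
            mad(y, np.median(samples, axis=0)),   # point forecast: median
            rmse(y, np.mean(samples, axis=0)),    # point forecast: mean
            quantile_loss(y, np.quantile(samples, q, axis=0), q),
            crps_from_samples(y, samples),
        ))
    return np.mean(rows, axis=0), len(rows)

# Toy usage on a synthetic sine wave (illustrative data only).
series = np.sin(np.arange(200) / 10.0)
scores, n_windows = backtest(series, naive_sampler,
                             context_len=50, horizon=10, stride=10)
print(dict(zip(["MAD", "RMSE", "QL", "CRPS"], scores)), n_windows)
```

Averaging over many windows, rather than scoring a single held-out window, is what makes the protocol less sensitive to the choice of evaluation period. MAD and RMSE here score point forecasts derived from the samples, while QL and CRPS score the predicted distribution itself.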