Abstract: Sequence-to-sequence models based on LSTM and GRU are a most popular choice for forecasting time series data reaching state-of-the-art performance. Training such models can be delicate though. The two most common training strategies within this context are teacher forcing (TF) and free running (FR). TF can be used to help the model to converge faster but may provoke an exposure bias issue due to a discrepancy between training and inference phase. FR helps to avoid this but does not necessarily lead to better results, since it tends to make the training slow and unstable instead. Scheduled sampling was the first approach tackling these issues by picking the best from both worlds and combining it into a curriculum learning (CL) strategy. Although scheduled sampling seems to be a convincing alternative to FR and TF, we found that, even if parametrized carefully, scheduled sampling may lead to premature termination of the training when applied for time series forecasting. To mitigate the problems of the above approaches we formalize CL strategies along the training as well as the training iteration scale. We propose several new curricula, and systematically evaluate their performance in two experimental sets. For our experiments, we utilize six datasets generated from prominent chaotic systems. We found that the newly proposed increasing training scale curricula with a probabilistic iteration scale curriculum consistently outperforms previous training strategies yielding an NRMSE improvement of up to 81% over FR or TF training. For some datasets we additionally observe a reduced number of training iterations. We observed that all models trained with the new curricula yield higher prediction stability allowing for longer prediction horizons.
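The difference between TF and FR described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `rollout`, `toy_step`, and the parameter `tf_prob` are hypothetical names, and the "model" is a deliberately biased toy function rather than an LSTM or GRU. With `tf_prob=1.0` every step is fed the ground truth (TF), with `tf_prob=0.0` every step is fed the model's own previous prediction (FR), and values in between mix the two per step.

```python
import random


def rollout(step_fn, targets, first_input, tf_prob, rng=None):
    """Predict len(targets) steps; after each step, feed back the ground
    truth with probability tf_prob, otherwise the model's own prediction."""
    if rng is None:
        rng = random.Random(0)
    preds = []
    x = first_input
    for y_true in targets:
        y_pred = step_fn(x)
        preds.append(y_pred)
        # Teacher forcing feeds the true value; free running feeds the prediction.
        x = y_true if rng.random() < tf_prob else y_pred
    return preds


def toy_step(x):
    """Hypothetical one-step model: the true map is 2 * x, the model adds a small bias."""
    return 2.0 * x + 0.1


targets = [2.0, 4.0, 8.0, 16.0]  # true continuation of the sequence starting at 1.0
tf_preds = rollout(toy_step, targets, 1.0, tf_prob=1.0)  # pure teacher forcing
fr_preds = rollout(toy_step, targets, 1.0, tf_prob=0.0)  # pure free running

print("TF:", tf_preds)
print("FR:", fr_preds)
```

In this toy setting the TF predictions stay within the one-step bias of the targets because the inputs are always correct, while the FR predictions drift further with every step because each error is fed back in. The model never sees that accumulating drift under pure TF, which is the discrepancy behind the exposure bias.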

What problem does this paper attempt to address?

The paper addresses shortcomings of the two most common training strategies for time-series forecasting, Teacher Forcing (TF) and Free Running (FR), particularly when they are applied to time-series data from chaotic systems. Specifically:

1. **Teacher Forcing (TF)**: It speeds up convergence, but the inputs seen during training differ from those seen during inference, which causes exposure bias. The model is never exposed to its own erroneous predictions during training, so it tolerates small errors poorly at inference time, which limits its performance over long prediction horizons.
2. **Free Running (FR)**: It avoids exposure bias, but it tends to make training slow and unstable and does not necessarily lead to better results.
3. **Limitations of existing solutions**: Scheduled sampling attempts to combine the advantages of TF and FR, but when applied to time-series forecasting it may lead to premature termination of training, even if parametrized carefully.

To address these problems, the authors formalize Curriculum Learning (CL) strategies along both the training scale and the training-iteration scale and propose several new curricula. By adjusting the proportion of TF and FR during training, these curricula aim to improve the prediction stability and accuracy of the model, especially on data from chaotic systems.

The main contributions of the paper are:

- Proposing several new curriculum learning strategies and systematically evaluating them in two sets of experiments on six datasets generated from chaotic systems.
- Finding that the newly proposed increasing training scale curricula, combined with a probabilistic iteration scale curriculum, consistently outperform previous training strategies, with an NRMSE (Normalized Root Mean Square Error) improvement of up to 81% over TF or FR training.
- Observing that, for some datasets, the new curricula also reduce the number of training iterations, and that all models trained with them show higher prediction stability, allowing for longer prediction horizons.

In conclusion, the paper improves the accuracy and stability of time-series forecasting models by changing how they are trained, with a focus on time-series data from chaotic systems.
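The two scales mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: the summary does not define the curricula, so the sketch assumes that "increasing" means the teacher-forcing probability rises over the epochs, that the rise is linear, and that "probabilistic" means each decoder step within an iteration independently draws TF or FR with that probability. The function names `linear_tf_prob` and `sample_tf_mask` are hypothetical.

```python
import random


def linear_tf_prob(epoch, num_epochs, start=0.0, end=1.0):
    """Training-scale curriculum (assumed linear): TF probability for a given epoch.
    start < end gives an increasing curriculum, start > end a decreasing one."""
    if num_epochs <= 1:
        return end
    frac = min(max(epoch / (num_epochs - 1), 0.0), 1.0)
    return start + (end - start) * frac


def sample_tf_mask(num_steps, tf_prob, rng):
    """Iteration-scale curriculum (assumed probabilistic): for each decoder step,
    True means feed the ground truth (TF), False means feed the prediction (FR)."""
    return [rng.random() < tf_prob for _ in range(num_steps)]


rng = random.Random(42)
num_epochs = 5
increasing = [linear_tf_prob(e, num_epochs) for e in range(num_epochs)]
decreasing = [linear_tf_prob(e, num_epochs, start=1.0, end=0.0) for e in range(num_epochs)]

for epoch, p in enumerate(increasing):
    mask = sample_tf_mask(8, p, rng)
    print(f"epoch {epoch}: tf_prob={p:.2f} mask={mask}")
```

A decreasing schedule of this kind corresponds in spirit to scheduled sampling, which starts from TF and moves toward FR. The increasing variant reverses that order, so the model is confronted with its own predictions from the first epoch.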

Flipped Classroom: Effective Teaching for Time Series Forecasting

A Novel Sequence-to-Sequence-Based Deep Learning Model for Multistep Load Forecasting

Effective LSTMs with Seasonal-Trend Decomposition and Adaptive Learning and Niching-Based Backtracking Search Algorithm for Time Series Forecasting

Sequence Modeling with Recurrent Neural Networks (RNNs) for Student Learning Behavior Pattern Recognition in a Flipped Classroom

Enhancing Spatio-temporal Quantile Forecasting with Curriculum Learning: Lessons Learned

Time Series Forecasting (TSF) Using Various Deep Learning Models

Real-time Forecasting of Time Series in Financial Markets Using Sequentially Trained Many-to-one LSTMs

Phase-Space-Guided Deep Learning for Time Series Forecasting

Training and Evaluating Causal Forecasting Models for Time-Series

Learning to forecast: The probabilistic time series forecasting challenge

Do We Really Need Deep Learning Models for Time Series Forecasting?

What Constitutes Good Contrastive Learning in Time-Series Forecasting?

Predicting student performance using sequence classification with time-based windows

Enabling Time-series Foundation Model for Building Energy Forecasting via Contrastive Curriculum Learning

Test Time Learning for Time Series Forecasting

Time-Series Classification for Dynamic Strategies in Multi-Step Forecasting

A Time Series Forecasting Model Selection Framework using CNN and Data Augmentation for Small Sample Data

Time-LLM: Time Series Forecasting by Reprogramming Large Language Models

Time Series Representation Models

Making Good on LSTMs' Unfulfilled Promise

Learning Structured Components: Towards Modular and Interpretable Multivariate Time Series Forecasting