Towards Better Long-range Time Series Forecasting using Generative Adversarial Networks

Shiyu Liu, Rohan Ghosh, Mehul Motani
DOI: https://doi.org/10.48550/arXiv.2110.08770
2022-08-05
Abstract: Long-range time series forecasting is usually based on one of two existing forecasting strategies: Direct Forecasting and Iterative Forecasting, where the former provides low bias, high variance forecasts and the latter leads to low variance, high bias forecasts. In this paper, we propose a new forecasting strategy called Generative Forecasting (GenF), which generates synthetic data for the next few time steps and then makes long-range forecasts based on generated and observed data. We theoretically prove that GenF is able to better balance the forecasting variance and bias, leading to a much smaller forecasting error. We implement GenF via three components: (i) a novel conditional Wasserstein Generative Adversarial Network (GAN) based generator for synthetic time series data generation, called CWGAN-TS; (ii) a transformer based predictor, which makes long-range predictions using both generated and observed data; and (iii) an information theoretic clustering algorithm to improve the training of both the CWGAN-TS and the transformer based predictor. The experimental results on five public datasets demonstrate that GenF significantly outperforms a diverse range of state-of-the-art benchmarks and classical approaches. Specifically, we find a 5% - 11% improvement in predictive performance (mean absolute error) while having a 15% - 50% reduction in parameters compared to the benchmarks. Lastly, we conduct an ablation study to demonstrate the effectiveness of the components comprising GenF.
Machine Learning, Artificial Intelligence
### What problems does this paper attempt to solve?

This paper addresses long-range time series forecasting. It proposes a new forecasting strategy in response to the shortcomings of the two main existing strategies, Direct Forecasting (DF) and Iterative Forecasting (IF).

#### Limitations of existing methods

1. **Direct Forecasting (DF)**:
   - **Advantages**: Produces low-bias forecasts.
   - **Disadvantages**: The forecasts have high variance, and performance degrades (i.e., the variance increases) as the number of prediction steps \( N \) increases. For example, when the optimal forecast is a linear trend, DF may produce a broken curve because it does not exploit the dependencies between the forecast values at successive steps.
2. **Iterative Forecasting (IF)**:
   - **Advantages**: Produces low-variance forecasts.
   - **Disadvantages**: The forecasts have high bias. Because the synthetic data are generated recursively in a supervised manner, errors propagate from step to step, and performance degrades (i.e., the bias increases) as \( N \) increases.

#### The proposed new method

To address these problems, the authors propose a new strategy called Generative Forecasting (GenF). GenF forecasts in two steps:

1. **Generate synthetic data**: Based on the past \( M \) observations, GenF first generates synthetic data \( \tilde{X}_{M + 1},\ldots,\tilde{X}_{M + L} \) for the next \( L \) time steps.
2. **Long-range prediction**: The past observations \( X_1,\ldots,X_M \) are concatenated with the generated synthetic data \( \tilde{X}_{M + 1},\ldots,\tilde{X}_{M + L} \) while the window size is kept at \( M \) (so only the most recent \( M - L \) observations, \( X_{L + 1},\ldots,X_M \), remain in the window). The resulting window is then used to make the long-range forecast.

#### Key innovation points

- **Balance bias and variance**: The paper proves theoretically that GenF balances forecasting bias and variance better, thereby reducing the forecasting error.
- **Flexibility**: Unlike in IF, the synthetic window length \( L \) in GenF does not depend on the number of prediction steps \( N \) and can be adjusted freely. Tuning \( L \) trades off bias against variance.
- **Component implementation**: GenF is implemented through three components:
  1. **Conditional Wasserstein GAN (CWGAN-TS)**: Generates synthetic time series data.
  2. **Transformer-based predictor**: Makes long-range predictions from the generated and observed data.
  3. **Information-theoretic clustering algorithm (ITC)**: Improves the training of CWGAN-TS and the transformer-based predictor.

### Experimental results

On five public datasets, GenF significantly outperforms a range of state-of-the-art benchmarks and classical approaches. Specifically:

- Predictive performance (mean absolute error, MAE) improves by 5% to 11%.
- The number of parameters is reduced by 15% to 50% compared to the benchmarks.

In addition, the ablation study demonstrates the effectiveness of the components of GenF.

In summary, the paper addresses the difficulty of balancing bias and variance in existing long-range time series forecasting methods by introducing Generative Forecasting, which yields more accurate forecasts with fewer parameters.
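
The two-step GenF procedure summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`generate_synthetic`, `genf_window`, `genf_forecast`) are hypothetical, and the two toy linear models are stand-ins for CWGAN-TS and the transformer-based predictor. The sketch also assumes that the predictor forecasts \( N - L \) steps beyond the end of the window, which the summary does not state explicitly.

```python
import numpy as np


def generate_synthetic(observed, L, one_step):
    """Roll a one-step generator forward L times.

    `one_step` is a hypothetical stand-in for the CWGAN-TS generator.
    """
    history = list(observed)
    M = len(history)
    synthetic = []
    for _ in range(L):
        # Condition each new step on the latest M values (observed + generated).
        nxt = one_step(np.asarray(history[-M:], dtype=float))
        synthetic.append(nxt)
        history.append(nxt)
    return np.asarray(synthetic, dtype=float)


def genf_window(observed, synthetic):
    """Concatenate observed and synthetic data, keeping the window size at M."""
    observed = np.asarray(observed, dtype=float)
    synthetic = np.asarray(synthetic, dtype=float)
    L = len(synthetic)
    if L > len(observed):
        raise ValueError("L must not exceed the window size M")
    # Drop the oldest L observations so the window still holds M values.
    return np.concatenate([observed[L:], synthetic])


def genf_forecast(observed, L, N, one_step, direct):
    """Forecast X_{M+N} from M observations using L synthetic steps.

    Assumption: the predictor `direct` forecasts N - L steps past the window.
    """
    synthetic = generate_synthetic(observed, L, one_step)
    window = genf_window(observed, synthetic)
    return direct(window, N - L)


# Toy models (illustrative only): both extrapolate the last observed slope.
def linear_one_step(window):
    return window[-1] + (window[-1] - window[-2])


def linear_direct(window, horizon):
    return window[-1] + horizon * (window[-1] - window[-2])


observed = np.arange(5.0)  # X_1..X_5 = 0, 1, 2, 3, 4, so M = 5
forecast = genf_forecast(observed, L=2, N=5,
                         one_step=linear_one_step, direct=linear_direct)
print(forecast)  # 9.0, the value of the linear trend at X_10
```

In this sketch, `L = 0` reduces to a purely direct forecast and `L = N` to a purely iterative one, so intermediate values of `L` sit between the two strategies.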