Abstract: State-of-the-art approaches in time series generation (TSG), such as TimeVQVAE, utilize vector quantization-based tokenization to effectively model complex distributions of time series. These approaches first learn to transform time series into a sequence of discrete latent vectors, and then a prior model is learned to model the sequence. The discrete latent vectors, however, only capture low-level semantics (e.g., shapes). We hypothesize that higher-fidelity time series can be generated by training a prior model on more informative discrete latent vectors that contain both low and high-level semantics (e.g., characteristic dynamics). In this paper, we introduce a novel framework, termed NC-VQVAE, to integrate self-supervised learning into those TSG methods to derive a discrete latent space where low and high-level semantics are captured. Our experimental results demonstrate that NC-VQVAE results in a considerable improvement in the quality of synthetic samples.
### What problem does this paper attempt to solve?
This paper addresses a limitation of current time series generation (TSG) models: their discrete latent representations capture only low-level semantics. Existing methods such as TimeVQVAE use vector quantization (VQ) to convert a time series into a sequence of discrete latent vectors, and these vectors mainly capture low-level information such as local shapes. The authors hypothesize that generating higher-fidelity time series requires a prior model trained on discrete latent vectors that capture both low-level and high-level semantics (for example, characteristic dynamics).
To this end, the authors propose NC-VQVAE (Non-Contrastive VQVAE), a framework that integrates self-supervised learning (SSL) into VQ-based TSG methods so that the discrete latent space encodes both low-level and high-level semantics. Specifically, they add a non-contrastive self-supervised loss, which lets the model capture high-level semantics such as characteristic dynamics in addition to local shapes.
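To make the idea of a non-contrastive self-supervised loss concrete, here is a minimal NumPy sketch of a Barlow Twins-style objective, one of the SSL methods named in this summary. It is an illustration only, not the authors' implementation: the function name `barlow_twins_loss`, the `offdiag_weight` default, and the toy embeddings are all assumptions made for this example.

```python
import numpy as np


def barlow_twins_loss(z_a, z_b, offdiag_weight=5e-3, eps=1e-8):
    """Barlow Twins-style loss for two views' embeddings, each (batch, dim).

    Hypothetical helper for illustration; the weight default is an assumption.
    """
    n = z_a.shape[0]
    # Standardize every feature over the batch dimension.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    # Cross-correlation matrix between the two views, shape (dim, dim).
    c = z_a.T @ z_b / n
    diag = np.diag(c)
    # Invariance term: matching features should correlate perfectly.
    on_diag = np.sum((diag - 1.0) ** 2)
    # Redundancy-reduction term: different features should be decorrelated.
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)
    return float(on_diag + offdiag_weight * off_diag)


# Toy data: two slightly perturbed views of the same embeddings versus
# two unrelated embeddings.
rng = np.random.default_rng(0)
z = rng.normal(size=(256, 8))
noise = rng.normal(size=(256, 8))
loss_matched = barlow_twins_loss(z, z + 0.05 * noise)
loss_unrelated = barlow_twins_loss(z, noise)
```

No negative pairs are needed: the loss only asks that the two views agree feature by feature while the features themselves stay decorrelated, which is what makes the objective non-contrastive.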
### Main contributions
1. **Introducing the NC-VQVAE framework**: By combining non-contrastive self-supervised learning, the traditional VQVAE model is improved so that it can capture both low-level and high-level semantic information in the discrete latent space simultaneously.
2. **Improving the quality of synthetic samples**: Experimental results show that NC-VQVAE significantly outperforms the traditional VQVAE model on multiple evaluation metrics, including classification accuracy, Inception Score (IS), and Fréchet Inception Distance (FID).
3. **Verifying the effectiveness of SSL**: Research shows that self-supervised learning methods (such as Barlow Twins and VIbCReg) can significantly improve the performance of time-series generation models, especially on complex datasets.
### Method overview
- **Stage 1: Tokenization**: Train the VQVAE with a non-contrastive self-supervised loss so that the discrete latent representation contains both low-level and high-level semantic information.
- **Stage 2: Prior Learning**: Train the prior (generative) model on the discrete latent representation learned in Stage 1 to further improve the quality of synthetic samples.
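The two stages can be sketched roughly as follows. The code shows the vector-quantization lookup that turns encoder outputs into discrete tokens (the sequence a Stage 2 prior would model) and one plausible way to combine the Stage 1 loss terms. The names `quantize` and `stage1_loss`, the weights `beta` and `ssl_weight`, and the additive form of the objective are assumptions for illustration; the summary does not specify how the paper weights or combines these terms.

```python
import numpy as np


def quantize(z_e, codebook):
    """Nearest-neighbour vector quantization.

    z_e: encoder outputs, shape (seq_len, dim).
    codebook: learnable code vectors, shape (num_codes, dim).
    Returns (token indices, quantized vectors).
    """
    # Squared Euclidean distance from every latent to every code.
    d2 = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)
    return idx, codebook[idx]


def stage1_loss(x, x_hat, z_e, z_q, ssl_loss, beta=0.25, ssl_weight=1.0):
    """Assumed Stage 1 objective: reconstruction + VQ terms + SSL term."""
    recon = np.mean((x - x_hat) ** 2)
    # The codebook and commitment terms differ only in where the gradient is
    # stopped, so without autograd they share the same numerical value.
    vq = np.mean((z_q - z_e) ** 2)
    return float(recon + (1.0 + beta) * vq + ssl_weight * ssl_loss)


# Toy example: three codes in 2-D and four latent vectors.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
z_e = np.array([[0.1, -0.1], [0.9, 1.2], [-0.8, 0.7], [0.2, 0.1]])
tokens, z_q = quantize(z_e, codebook)
# `tokens` is the discrete sequence that a Stage 2 prior model is trained on.
```

In a real training loop the SSL term would be computed on embeddings of two augmented views of each series, and gradients would flow through the encoder via a straight-through estimator; both details are omitted here.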
### Experimental results
- **Reconstruction and classification capabilities**: The reconstruction loss of NC-VQVAE on multiple datasets is comparable to that of the traditional VQVAE, but the classification accuracy is significantly improved.
- **IS and FID scores**: NC-VQVAE obtains higher IS and lower FID on most datasets, indicating that the samples it generates are closer to the real data.
- **Visual inspection**: Through t-SNE and PCA visualization, the samples generated by NC-VQVAE show a better-structured distribution in the latent space.
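For readers unfamiliar with the two metrics, the sketch below computes them from their standard definitions. It assumes that feature vectors and class probabilities have already been extracted from real and generated series by some pretrained classifier; which network is used is not stated in this summary, so the inputs here are random placeholders. The function names are hypothetical.

```python
import numpy as np


def frechet_distance(feats_real, feats_gen):
    """Fréchet distance between Gaussians fitted to two feature sets (n, dim)."""
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)
    # tr((cov_r cov_g)^(1/2)) equals the sum of square roots of the
    # eigenvalues of the product, which are real and non-negative for
    # covariance matrices up to numerical error.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_r - mu_g) ** 2)
    return float(mean_term + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)


def inception_score(probs, eps=1e-12):
    """exp of the mean KL divergence between p(y|x) and the marginal p(y)."""
    p_y = probs.mean(axis=0, keepdims=True)
    kl = np.sum(probs * (np.log(probs + eps) - np.log(p_y + eps)), axis=1)
    return float(np.exp(kl.mean()))


# Placeholder features: a shifted copy has identical covariance, so only the
# mean term contributes to the distance.
rng = np.random.default_rng(0)
feats = rng.normal(size=(500, 4))
fd_same = frechet_distance(feats, feats)
fd_shifted = frechet_distance(feats, feats + 3.0)

# Confident, evenly spread predictions over 3 classes versus uniform ones.
is_confident = inception_score(np.eye(3)[np.arange(300) % 3])
is_uniform = inception_score(np.full((300, 3), 1.0 / 3.0))
```

Lower FID means the generated feature distribution sits closer to the real one, while higher IS means predictions on generated samples are both confident and diverse across classes.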
In conclusion, this paper significantly improves the capabilities of time-series generation models by introducing self-supervised learning techniques, especially in capturing high-level semantic information of complex time series.