A Systematic Evaluation of Generated Time Series and Their Effects in Self-Supervised Pretraining

Audrey Der, Chin-Chia Michael Yeh, Xin Dai, Huiyuan Chen, Yan Zheng, Yujie Fan, Zhongfang Zhuang, Vivian Lai, Junpeng Wang, Liang Wang, Wei Zhang, Eamonn Keogh
2024-08-15
Abstract: Self-supervised Pretrained Models (PTMs) have demonstrated remarkable performance in computer vision and natural language processing tasks. These successes have prompted researchers to design PTMs for time series data. In our experiments, most self-supervised time series PTMs were surpassed by simple supervised models. We hypothesize this undesired phenomenon may be caused by data scarcity. In response, we test six time series generation methods, use the generated data in pretraining in lieu of the real data, and examine the effects on classification performance. Our results indicate that replacing a real-data pretraining set with a greater volume of only generated samples produces noticeable improvement.
Machine Learning