Embarrassingly Simple MixUp for Time-series

Karan Aggarwal, Jaideep Srivastava

2023-04-10

Abstract: Labeling time series data is expensive because it requires domain expertise and because the data is dynamic in nature. Hence, we often have to work in settings with limited labeled data. Data augmentation techniques have been successfully deployed in domains like computer vision to make better use of existing labeled data. We adapt one of the most commonly used techniques, MixUp, to the time series domain. Our proposed methods, MixUp++ and LatentMixUp++, use simple modifications to perform interpolation in the raw time series and in the classification model's latent space, respectively. We also extend these methods with semi-supervised learning to exploit unlabeled data. With LatentMixUp++, we observe significant improvements of 1% to 15% in time series classification on two public datasets, in both low and high labeled data regimes.

Machine Learning, Artificial Intelligence

What problem does this paper attempt to address?

This paper addresses two core issues in annotating time series data:

1. **High annotation cost**: Because time series data is dynamic and temporal, annotation requires domain experts, which makes the process time-consuming and expensive.
2. **Limited annotated data**: In fields like healthcare, where precise annotation is crucial, only a limited amount of annotated data is usually available.

To tackle these problems, the paper proposes two time series data augmentation methods based on MixUp: MixUp++ and LatentMixUp++. They generate synthetic samples by interpolating data in the raw time series and in the latent space of the classification model, respectively. Both were validated on two public datasets (human activity recognition and sleep staging). The experiments show improved time series classification performance in both low and high labeled data regimes, with gains of 1% to 15% reported for LatentMixUp++, which performs especially well when labeled data is scarce.

The paper also extends these methods to a semi-supervised setting that uses unannotated data through pseudo-labeling. This extension is particularly effective when little annotated data is available.
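The interpolation and pseudo-labeling ideas summarized above can be illustrated with a minimal sketch. It shows standard mixup (as in "mixup: Beyond Empirical Risk Minimization"), not the specific MixUp++ or LatentMixUp++ modifications, which this summary does not describe. The function names `mixup_batch` and `pseudo_label`, the `alpha` value, the array shapes, and the confidence threshold are all assumptions made for illustration.

```python
import numpy as np


def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """Standard mixup: blend each sample with a randomly chosen partner.

    x        : array of shape (batch, time_steps, channels), hypothetical layout
    y_onehot : array of shape (batch, num_classes)
    alpha    : Beta distribution parameter (assumed value, not from the paper)
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)        # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))      # partner index for every sample
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam


def pseudo_label(probs, threshold=0.9):
    """Confidence-thresholded pseudo-labeling (an assumed, common variant).

    probs : array of shape (num_unlabeled, num_classes) with predicted
            class probabilities for unlabeled samples.
    Returns the indices of the samples kept and their predicted labels.
    """
    keep = probs.max(axis=1) >= threshold
    return np.flatnonzero(keep), probs.argmax(axis=1)[keep]


# Usage on synthetic data: 8 windows, 128 time steps, 3 sensor channels.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 128, 3))
y = np.eye(4)[rng.integers(0, 4, size=8)]
x_mix, y_mix, lam = mixup_batch(x, y, alpha=0.2, rng=rng)
```

Applying the same function to the hidden representations produced by a classifier's encoder, instead of to the raw windows, gives the general idea of interpolating in latent space. Pseudo-labeled samples selected by `pseudo_label` could then be added to the labeled pool before mixing.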

Mixup Augmentation with Multiple Interpolations

TransformMix: Learning Transformation and Mixing Strategies from Data

Decoupled Mixup for Data-efficient Learning

MixupE: Understanding and Improving Mixup from Directional Derivative Perspective

Harnessing Hard Mixed Samples with Decoupled Regularizer

Empirical Study of Mix-based Data Augmentation Methods in Physiological Time Series Data

Mixup Without Hesitation

Global Mixup: Eliminating Ambiguity with Clustering

Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks

A Data Cartography based MixUp for Pre-trained Language Models

ShuffleMix: Improving Representations via Channel-Wise Shuffle of Interpolated Hidden States

On Mixup Regularization

C-Mixup: Improving Generalization in Regression

ISM: intra-class similarity mixing for time series augmentation

OpenMixup: A Comprehensive Mixup Benchmark for Visual Classification

Local Mixup: Interpolation of closest input signals to prevent manifold intrusion

DoubleMix: Simple Interpolation-Based Data Augmentation for Text Classification

Augment on Manifold: Mixup Regularization with UMAP

MetaMixUp: Learning Adaptive Interpolation Policy of MixUp with Meta-Learning

mixup: Beyond Empirical Risk Minimization