Abstract: Mixup and its variants form a popular class of data augmentation techniques. Using a random sample pair, it generates a new sample by linear interpolation of the inputs and labels. However, generating only one single interpolation may limit its augmentation ability. In this paper, we propose a simple yet effective extension called multi-mix, which generates multiple interpolations from a sample pair. With an ordered sequence of generated samples, multi-mix can better guide the training process than standard mixup. Moreover, theoretically, this can also reduce the stochastic gradient variance. Extensive experiments on a number of synthetic and large-scale data sets demonstrate that multi-mix outperforms various mixup variants and non-mixup-based baselines in terms of generalization, robustness, and calibration.
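
For orientation, the interpolation the abstract refers to can be written out. The first two lines are the standard mixup construction for a sample pair $(x_i, y_i)$, $(x_j, y_j)$. The last two are only one plausible reading of "multiple interpolations from a sample pair": the symbols $K$, $\lambda_k$, $\ell$, and $f$, and the choice to average the per-sample losses, are assumptions made here for illustration and are not taken from the paper.

$$
\begin{aligned}
\tilde{x} &= \lambda x_i + (1-\lambda)\, x_j, \qquad \tilde{y} = \lambda y_i + (1-\lambda)\, y_j, \\
\lambda &\sim \mathrm{Beta}(\alpha, \alpha), \\
\tilde{x}_k &= \lambda_k x_i + (1-\lambda_k)\, x_j, \qquad \tilde{y}_k = \lambda_k y_i + (1-\lambda_k)\, y_j, \qquad k = 1, \dots, K, \\
\mathcal{L}_{\text{pair}} &= \frac{1}{K} \sum_{k=1}^{K} \ell\big(f(\tilde{x}_k), \tilde{y}_k\big).
\end{aligned}
$$

Under this reading, setting $K = 1$ recovers standard mixup.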

What problem does this paper attempt to address?

The paper addresses data augmentation in deep learning, specifically Mixup and its variants, and proposes an extension called "Multi-Mix." Mixup is a popular data augmentation technique that generates new training samples by linearly interpolating input samples and their corresponding labels, which improves generalization and reduces overfitting. However, traditional Mixup generates only a single interpolated sample per pair, which may limit its augmentation capability.

### Overview of the Problem Addressed by the Paper

- **Objective**: To propose a simple and effective Mixup extension (Multi-Mix) that addresses the limitation of traditional Mixup, which generates only a single interpolated sample.
- **Method**: Multi-Mix generates multiple interpolated samples from each pair of samples, arranged as an ordered sequence, which can better guide network training and theoretically reduces the variance of stochastic gradients.
- **Contributions**:
  - Proposes the Multi-Mix algorithm, which generates multiple ordered interpolated samples from each sample pair.
  - Proves theoretically that Multi-Mix can reduce the variance of stochastic gradients.
  - Validates experimentally the superior performance of Multi-Mix on various tasks, including classification, weakly supervised object localization, and robustness to adversarial background noise.

### Summary of Main Content

- **Multi-Interpolation Mixup (Multi-Mix)**: Multi-Mix generates multiple interpolated samples from each sample pair instead of just one. This helps guide network training and theoretically reduces the variance of stochastic gradients.
- **Related Work**: The paper reviews related work on Mixup and its variants, including input Mixup, manifold Mixup, and saliency-based Mixup methods.
- **Theoretical Analysis**: The paper provides a theoretical proof that Multi-Mix reduces the variance of stochastic gradients.
- **Experimental Results**:
  - On synthetic datasets, Multi-Mix induces smoother decision boundaries and improves test accuracy.
  - On real-world image classification tasks (e.g., CIFAR-100, Tiny-ImageNet), Multi-Mix achieves better classification performance than other Mixup variants.
  - On weakly supervised object localization tasks, Multi-Mix also achieves the best results.
  - Multi-Mix is robust to adversarial background noise.
  - In transfer learning tasks, models pre-trained with Multi-Mix require fewer fine-tuning steps to achieve good localization performance.

In summary, the paper extends Mixup by introducing Multi-Mix, supports its advantages with a theoretical analysis, and validates its effectiveness in a range of experimental settings.
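
The core idea summarized above, several ordered interpolations from one sample pair, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `multi_mix_pair`, the parameters `num_interp` and `alpha`, the choice of a Beta(alpha, alpha) distribution for the mixing coefficients, and the use of sorting to obtain the "ordered sequence" are all hypothetical choices made here.

```python
import numpy as np


def multi_mix_pair(x_i, y_i, x_j, y_j, num_interp=5, alpha=1.0, rng=None):
    """Hypothetical helper: build several interpolations of one sample pair.

    Assumptions (not taken from the paper): coefficients are drawn from
    Beta(alpha, alpha) and sorted to give an ordered sequence of samples.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Draw K mixing coefficients and sort them in ascending order.
    lams = np.sort(rng.beta(alpha, alpha, size=num_interp))
    # Reshape so the coefficients broadcast over inputs of any shape.
    lam_x = lams.reshape(-1, *([1] * x_i.ndim))
    xs = lam_x * x_i + (1.0 - lam_x) * x_j
    # Labels are assumed to be one-hot (or soft) vectors of shape (C,).
    lam_y = lams.reshape(-1, 1)
    ys = lam_y * y_i + (1.0 - lam_y) * y_j
    return xs, ys, lams


# Usage: mix an all-zeros input with an all-ones input, two classes.
rng = np.random.default_rng(0)
x_a, x_b = np.zeros(4), np.ones(4)
y_a, y_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
xs, ys, lams = multi_mix_pair(x_a, y_a, x_b, y_b, num_interp=3, rng=rng)
print(xs.shape, ys.shape)  # one row per interpolation
```

With `num_interp=1` this reduces to ordinary Mixup for a single pair. In a training loop one would typically average the loss over the returned rows, but how the paper combines them is not specified in this summary.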

Mixup Augmentation with Multiple Interpolations

MixupE: Understanding and Improving Mixup from Directional Derivative Perspective

Global Mixup: Eliminating Ambiguity with Clustering.

Mixup Without Hesitation

Decoupled Mixup for Data-efficient Learning

ShuffleMix: Improving Representations via Channel-Wise Shuffle of Interpolated Hidden States

Infinite Class Mixup

Harnessing Hard Mixed Samples with Decoupled Regularizer

TransformMix: Learning Transformation and Mixing Strategies from Data

On Mixup Regularization

Local Mixup: Interpolation of closest input signals to prevent manifold intrusion

MetaMixUp: Learning Adaptive Interpolation Policy of MixUp with Meta-Learning

RandoMix: a mixed sample data augmentation method with multiple mixed modes

C-Mixup: Improving Generalization in Regression

SegMix: A Simple Structure-Aware Data Augmentation Method

DoubleMix: Simple Interpolation-Based Data Augmentation for Text Classification

PointMixup: Augmentation for Point Clouds

WeMix: How to Better Utilize Data Augmentation

Tailoring Mixup to Data for Calibration

A Survey on Mixup Augmentations and Beyond

AutoMix: Unveiling the Power of Mixup for Stronger Classifiers