Mixup Augmentation with Multiple Interpolations

Lifeng Shen, Jincheng Yu, Hansi Yang, James T. Kwok
2024-06-03
Abstract: Mixup and its variants form a popular class of data augmentation techniques. Using a random sample pair, it generates a new sample by linear interpolation of the inputs and labels. However, generating only one single interpolation may limit its augmentation ability. In this paper, we propose a simple yet effective extension called multi-mix, which generates multiple interpolations from a sample pair. With an ordered sequence of generated samples, multi-mix can better guide the training process than standard mixup. Moreover, theoretically, this can also reduce the stochastic gradient variance. Extensive experiments on a number of synthetic and large-scale data sets demonstrate that multi-mix outperforms various mixup variants and non-mixup-based baselines in terms of generalization, robustness, and calibration.
Machine Learning, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses data augmentation in deep learning, specifically Mixup and its variants, and proposes an extension called "Multi-Mix." Mixup is a popular data augmentation technique that generates new training samples by linearly interpolating a pair of input samples and their corresponding labels, which improves the model's generalization and reduces overfitting. However, standard Mixup generates only a single interpolated sample per pair, which may limit its augmentation ability.

### Overview of the Problem Addressed by the Paper

- **Objective**: Propose a simple and effective extension of Mixup (Multi-Mix) that overcomes the limitation of generating only a single interpolated sample per sample pair.
- **Method**: Multi-Mix generates multiple interpolated samples from each sample pair and arranges them as an ordered sequence. This guides network training better than standard Mixup and, in theory, reduces the variance of the stochastic gradients.
- **Contributions**:
  - Proposes the Multi-Mix algorithm, which generates multiple ordered interpolated samples from each sample pair.
  - Shows theoretically that Multi-Mix can reduce the variance of the stochastic gradients.
  - Validates Multi-Mix experimentally on several tasks, including classification, weakly supervised object localization, and robustness to adversarial background noise.

### Summary of Main Content

- **Multi-Interpolation Mixup (Multi-Mix)**: Multi-Mix generates multiple interpolated samples from each sample pair instead of just one. This better guides network training and, in theory, reduces the variance of the stochastic gradients.
- **Related Work**: The paper reviews related work on Mixup and its variants, including input Mixup, manifold Mixup, and saliency-based Mixup methods.
- **Theoretical Analysis**: The paper provides a theoretical argument that Multi-Mix reduces the variance of the stochastic gradients.
- **Experimental Results**:
  - On synthetic datasets, Multi-Mix induces smoother decision boundaries and improves test accuracy.
  - On real-world image classification tasks (e.g., CIFAR-100, Tiny-ImageNet), Multi-Mix achieves better classification performance than other Mixup variants.
  - On weakly supervised object localization tasks, Multi-Mix also achieves the best results.
  - Multi-Mix shows good robustness to adversarial background noise.
  - In transfer learning tasks, models pre-trained with Multi-Mix require fewer fine-tuning steps to achieve good localization performance.

In summary, the paper extends Mixup with Multi-Mix, supports the extension with a theoretical analysis of its advantages, and validates its effectiveness in a range of experimental settings.
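
The interpolation step summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `multi_mix`, the parameters `k` and `alpha`, the choice to draw each interpolation weight from a Beta(alpha, alpha) distribution (as is common for standard Mixup), and the choice to order the samples by sorting the weights are all assumptions made for illustration. Inputs and one-hot labels are plain Python lists here so that the example needs only the standard library.

```python
import random
from typing import List, Optional, Sequence, Tuple

Vector = List[float]


def lerp(a: Sequence[float], b: Sequence[float], lam: float) -> Vector:
    """Elementwise linear interpolation: lam * a + (1 - lam) * b."""
    return [lam * u + (1.0 - lam) * v for u, v in zip(a, b)]


def multi_mix(
    x_i: Sequence[float],
    y_i: Sequence[float],
    x_j: Sequence[float],
    y_j: Sequence[float],
    k: int = 5,
    alpha: float = 1.0,
    rng: Optional[random.Random] = None,
) -> List[Tuple[float, Vector, Vector]]:
    """Generate k interpolations of one sample pair, ordered by weight.

    Assumption (not taken from the paper): each weight is an independent
    Beta(alpha, alpha) draw, and "ordered" means sorted in ascending order.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = rng or random.Random()
    # Draw k interpolation weights and sort them so that the generated
    # samples form an ordered path from (x_j, y_j) towards (x_i, y_i).
    lams = sorted(rng.betavariate(alpha, alpha) for _ in range(k))
    # Inputs and labels are mixed with the same weight, as in standard Mixup.
    return [(lam, lerp(x_i, x_j, lam), lerp(y_i, y_j, lam)) for lam in lams]


if __name__ == "__main__":
    # Two toy samples with one-hot labels for a two-class problem.
    samples = multi_mix(
        [0.0, 0.0], [1.0, 0.0],
        [1.0, 1.0], [0.0, 1.0],
        k=3,
        rng=random.Random(0),
    )
    for lam, x_mix, y_mix in samples:
        print(f"lam={lam:.3f}  x={x_mix}  y={y_mix}")
```

With `k=1` the sketch reduces to standard Mixup, which produces a single interpolation per pair. How the `k` samples enter the training loss is not specified in the summary above, so it is left out of the sketch.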