DiffuseMix: Label-Preserving Data Augmentation with Diffusion Models

Khawar Islam, Muhammad Zaigham Zaheer, Arif Mahmood, Karthik Nandakumar

2024-04-05

Abstract: Recently, a number of image-mixing-based augmentation techniques have been introduced to improve the generalization of deep neural networks. In these techniques, two or more randomly selected natural images are mixed together to generate an augmented image. Such methods may not only omit important portions of the input images but also introduce label ambiguities by mixing images across labels, resulting in misleading supervisory signals. To address these limitations, we propose DiffuseMix, a novel data augmentation technique that leverages a diffusion model to reshape training images, supervised by our bespoke conditional prompts. First, a partial natural image is concatenated with its generated counterpart, which helps avoid the generation of unrealistic images and label ambiguities. Then, to enhance resilience against adversarial attacks and improve safety measures, a randomly selected structural pattern from a set of fractal images is blended into the concatenated image to form the final augmented image for training. Our empirical results on seven different datasets reveal that DiffuseMix achieves superior performance compared to existing state-of-the-art methods on tasks including general classification, fine-grained classification, fine-tuning, data scarcity, and adversarial robustness. Augmented datasets and codes are available here:

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper addresses three problems with existing image-mixing data augmentation methods used to train deep learning models:

1. **Label Ambiguity**: Existing techniques build an augmented image by mixing images from different categories, which can leave the label unclear and produce misleading supervision signals.
2. **Important Region Omission**: Mixing can discard important parts of the input images.
3. **Cost and Limitations of Saliency Detection Dependence**: Some studies use saliency detection to alleviate the two issues above, but these methods are costly and only partly effective.

To address these issues, the paper proposes DiffuseMix, a data augmentation technique that uses a diffusion model to generate variants of the training images. DiffuseMix works in three steps:

1. **Conditional Prompt Generation**: Generate an image from the diffusion model, guided by a conditional prompt (e.g., "autumn scenery," "snow scene").
2. **Splicing Original and Generated Images**: Splice part of the original image with part of the generated image to form a mixed image that preserves the key semantic information.
3. **Fractal Image Fusion**: Blend a randomly selected fractal image into the mixed image to obtain the final augmented image for training, which increases structural diversity and helps avoid overfitting to the generated content.

Experimental results on seven datasets show that DiffuseMix outperforms existing state-of-the-art data augmentation methods in general classification, fine-grained classification, adversarial robustness, transfer learning, and data scarcity tasks.
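The splicing and fractal-fusion steps can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `half_mask` and `diffusemix_style_augment`, the choice of half-image masks, and the default `blend_weight` of 0.2 are all assumptions made for illustration. The diffusion step itself is not run here; the generated image is assumed to arrive as an array of the same shape as the original, with pixel values as floats.

```python
import numpy as np


def half_mask(height, width, rng):
    """Return a (H, W, 1) binary mask covering one randomly chosen image half.

    Hypothetical helper: using half-image masks is an assumption of this sketch.
    """
    mask = np.zeros((height, width, 1), dtype=np.float32)
    side = rng.choice(["top", "bottom", "left", "right"])
    if side == "top":
        mask[: height // 2] = 1.0
    elif side == "bottom":
        mask[height // 2:] = 1.0
    elif side == "left":
        mask[:, : width // 2] = 1.0
    else:
        mask[:, width // 2:] = 1.0
    return mask


def diffusemix_style_augment(original, generated, fractal, blend_weight=0.2, rng=None):
    """Splice an image with its generated counterpart, then blend in a fractal.

    All three inputs are float arrays of shape (H, W, C).
    blend_weight is an assumed value, not one stated in the text above.
    """
    rng = np.random.default_rng() if rng is None else rng
    if not (original.shape == generated.shape == fractal.shape):
        raise ValueError("original, generated and fractal must share one shape")

    height, width = original.shape[:2]
    mask = half_mask(height, width, rng)

    # Step 2: keep the original where mask == 1, the generated image elsewhere.
    hybrid = mask * original + (1.0 - mask) * generated

    # Step 3: blend the fractal pattern into the spliced image.
    return (1.0 - blend_weight) * hybrid + blend_weight * fractal
```

Because both halves of the spliced image come from the same source image, the augmented sample can keep the original label unchanged, which is the contrast with methods that mix images across classes.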
