DreamDA: Generative Data Augmentation with Diffusion Models

Yunxiang Fu, Chaoqi Chen, Yu Qiao, Yizhou Yu
2024-03-19
Abstract: The acquisition of large-scale, high-quality data is a resource-intensive and time-consuming endeavor. Compared to conventional Data Augmentation (DA) techniques (e.g. cropping and rotation), exploiting prevailing diffusion models for data generation has received scant attention in classification tasks. Existing generative DA methods either inadequately bridge the domain gap between real-world and synthesized images, or inherently suffer from a lack of diversity. To solve these issues, this paper proposes a new classification-oriented framework DreamDA, which enables data synthesis and label generation by way of diffusion models. DreamDA generates diverse samples that adhere to the original data distribution by considering training images in the original data as seeds and perturbing their reverse diffusion process. In addition, since the labels of the generated data may not align with the labels of their corresponding seed images, we introduce a self-training paradigm for generating pseudo labels and training classifiers using the synthesized data. Extensive experiments across four tasks and five datasets demonstrate consistent improvements over strong baselines, revealing the efficacy of DreamDA in synthesizing high-quality and diverse images with accurate labels. Our code will be available at
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses several key issues in Data Augmentation (DA), particularly the generation of high-quality, diverse synthetic data for image classification tasks. Specifically:

1. **Data Scarcity**: Collecting and annotating large-scale, high-quality datasets is time-consuming and costly, which limits the effective deployment of deep learning in many applications.
2. **Limitations of Traditional Data Augmentation Methods**: Traditional data augmentation techniques (such as cropping and rotation) preserve image semantics well but lack diversity and visual fidelity.
3. **Challenges of Applying Diffusion Models in Data Augmentation**: Although diffusion models can generate highly realistic images, they still face two main challenges in data augmentation tasks:
   - Design Complexity: Building conditioning mechanisms by crafting diverse prompts and optimizing conditional embeddings is complex, which may hinder practical use.
   - Trade-off Between Sample Diversity and Fidelity: Simply perturbing latent variables in the diffusion process (i.e., the output of the denoising U-Net) cannot produce sufficient data diversity.

To address these issues, the paper proposes a new framework called DreamDA, which leverages pre-trained diffusion models to generate diverse samples that conform to the real data distribution, using training images as seeds and perturbing their reverse diffusion process. Because a generated image may not share the label of its seed image, DreamDA also introduces a self-training paradigm that assigns pseudo labels to the synthetic data before training classifiers on it. Experiments across four tasks and five datasets show consistent improvements over strong baselines.
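
The self-training idea described above can be illustrated with a generic pseudo-labeling loop. The sketch below is not the DreamDA algorithm: it assumes synthetic samples are already available as feature vectors, swaps in a nearest-centroid classifier for a neural network, and uses a hypothetical confidence `threshold` to decide which pseudo labels to keep. All function names are made up for this illustration.

```python
import numpy as np

def fit_centroids(x, y, num_classes):
    """Toy classifier (assumption): one mean feature vector per class."""
    return np.stack([x[y == c].mean(axis=0) for c in range(num_classes)])

def predict_proba(centroids, x):
    """Softmax over negative squared distances to each class centroid."""
    d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def pseudo_label(centroids, x_syn, threshold):
    """Label synthetic samples with the current classifier; flag confident ones."""
    proba = predict_proba(centroids, x_syn)
    labels = proba.argmax(axis=1)
    keep = proba.max(axis=1) >= threshold
    return labels, keep

def self_train(x_real, y_real, x_syn, num_classes, rounds=3, threshold=0.9):
    """Alternate between pseudo-labeling synthetic data and refitting."""
    # Start from a classifier fit on the labeled real data only.
    centroids = fit_centroids(x_real, y_real, num_classes)
    for _ in range(rounds):
        labels, keep = pseudo_label(centroids, x_syn, threshold)
        # Refit on real data plus confidently pseudo-labeled synthetic data.
        x_all = np.concatenate([x_real, x_syn[keep]])
        y_all = np.concatenate([y_real, labels[keep]])
        centroids = fit_centroids(x_all, y_all, num_classes)
    return centroids, labels, keep

# Toy usage: two well-separated 2-D classes stand in for image features.
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [5.0, 5.0]])
y_real = np.array([0] * 10 + [1] * 10)
x_real = means[y_real] + 0.3 * rng.standard_normal((20, 2))
y_true_syn = np.array([0] * 30 + [1] * 30)  # hidden from the classifier
x_syn = means[y_true_syn] + 0.3 * rng.standard_normal((60, 2))

centroids, labels, keep = self_train(x_real, y_real, x_syn, num_classes=2)
```

The key design point is that synthetic samples never inherit the seed label directly; each one is relabeled by the current classifier, and only confident predictions feed the next round of training.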