Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model

Zhicai Wang, Longhui Wei, Tan Wang, Heyu Chen, Yanbin Hao, Xiang Wang, Xiangnan He, Qi Tian

2024-03-29

Abstract: Text-to-image (T2I) generative models have recently emerged as a powerful tool, enabling the creation of photo-realistic images and giving rise to a multitude of applications. However, the effective integration of T2I models into fundamental image classification tasks remains an open question. A prevalent strategy to bolster image classification performance is through augmenting the training set with synthetic images generated by T2I models. In this study, we scrutinize the shortcomings of both current generative and conventional data augmentation techniques. Our analysis reveals that these methods struggle to produce images that are both faithful (in terms of foreground objects) and diverse (in terms of background contexts) for domain-specific concepts. To tackle this challenge, we introduce an innovative inter-class data augmentation method known as Diff-Mix (

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper asks how generative data augmentation can improve image classification on domain-specific datasets. Existing generative models and conventional augmentation techniques struggle to produce images that are both faithful (the foreground object is preserved) and diverse (the background context varies) for domain-specific concepts, which limits their value as training data. To address this, the paper proposes Diff-Mix, an inter-class data augmentation method that enriches the dataset by translating images between classes, increasing background diversity while preserving foreground fidelity. The reported experiments show that Diff-Mix significantly improves performance across several classification settings, including few-shot, conventional, and long-tail classification.
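The inter-class idea described above can be sketched as a small planning step: for each training image, pick a different class to translate toward and attach a soft label. This is only an illustrative sketch, not the authors' implementation. The function names, the `strength` parameter, the exponent `gamma`, and the label-blending rule are all assumptions that this summary does not specify, and the diffusion-based image translation itself is omitted.

```python
import random


def sample_target_class(source_class, num_classes, rng):
    """Pick a target class uniformly among classes other than the source."""
    target = rng.randrange(num_classes - 1)
    return target if target < source_class else target + 1


def soft_label(source_class, target_class, strength, num_classes, gamma=1.0):
    """Blend the two one-hot labels (assumed rule, not taken from the summary).

    strength in [0, 1] stands for how far the image is pushed toward the
    target class; the target weight is strength ** gamma.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must lie in [0, 1]")
    weight = strength ** gamma
    label = [0.0] * num_classes
    label[source_class] += 1.0 - weight
    label[target_class] += weight
    return label


def plan_inter_class_augmentation(labels, num_classes, strength, seed=0):
    """Return (image_index, target_class, soft_label) for every image.

    A hypothetical image-translation model would consume this plan and edit
    each source image toward its target class.
    """
    rng = random.Random(seed)
    plan = []
    for index, source_class in enumerate(labels):
        target_class = sample_target_class(source_class, num_classes, rng)
        plan.append(
            (index, target_class,
             soft_label(source_class, target_class, strength, num_classes))
        )
    return plan


if __name__ == "__main__":
    for entry in plan_inter_class_augmentation([0, 1, 2, 2], 3, strength=0.7):
        print(entry)
```

Keeping the plan separate from the generator lets the class-pairing and labeling choices be inspected and tested without running a diffusion model.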
