Inversion Circle Interpolation: Diffusion-based Image Augmentation for Data-scarce Classification

Yanghao Wang, Long Chen
2024-11-21
Abstract: Data Augmentation (DA), i.e., synthesizing faithful and diverse samples to expand the original training set, is a prevalent and effective strategy to improve the performance of various data-scarce tasks. With the powerful image generation ability, diffusion-based DA has shown strong performance gains on different image classification benchmarks. In this paper, we analyze today's diffusion-based DA methods, and argue that they cannot take account of both faithfulness and diversity, which are two critical keys for generating high-quality samples and boosting classification performance. To this end, we propose a novel Diffusion-based DA method: Diff-II. Specifically, it consists of three steps: 1) Category concepts learning: Learning concept embeddings for each category. 2) Inversion interpolation: Calculating the inversion for each image, and conducting circle interpolation for two randomly sampled inversions from the same category. 3) Two-stage denoising: Using different prompts to generate synthesized images in a coarse-to-fine manner. Extensive experiments on various data-scarce image classification tasks (e.g., few-shot, long-tailed, and out-of-distribution classification) have demonstrated its effectiveness over state-of-the-art diffusion-based DA methods.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses data augmentation for image classification in data-scarce settings. Existing diffusion-based augmentation methods can generate realistic images, but they struggle to produce samples that are both faithful to the original category and diverse, which limits the generalization ability of downstream classifiers. The paper therefore proposes Diff-II, a diffusion-based data augmentation method that aims to generate faithful and diverse images in three steps (category concept learning, inversion interpolation, and two-stage denoising), thereby improving classification performance when training data are scarce.
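
The inversion interpolation step named in the abstract can be illustrated with a minimal sketch. It assumes that "circle interpolation" behaves like standard spherical linear interpolation (slerp) between two inverted latents, which may differ from the paper's exact formulation. The function names, the uniformly sampled interpolation ratio, and the NumPy arrays standing in for inversions are all illustrative assumptions, not the authors' code.

```python
import numpy as np


def circle_interpolate(z_a, z_b, t):
    """Slerp between two latents at ratio t in [0, 1].

    Assumption: stands in for the paper's "circle interpolation".
    """
    a = z_a.ravel().astype(np.float64)
    b = z_b.ravel().astype(np.float64)
    # Angle between the two latents, treated as flat vectors.
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    sin_theta = np.sin(theta)
    if sin_theta < 1e-6:
        # Nearly parallel or anti-parallel: fall back to linear interpolation.
        return (1.0 - t) * z_a + t * z_b
    w_a = np.sin((1.0 - t) * theta) / sin_theta
    w_b = np.sin(t * theta) / sin_theta
    return w_a * z_a + w_b * z_b


def sample_same_class_interpolations(inversions_by_class, per_class, rng):
    """Interpolate randomly chosen pairs of inversions within each category.

    inversions_by_class: dict mapping a label to a list of latent arrays
    (hypothetical container; the paper's data layout is not specified here).
    """
    out = {}
    for label, latents in inversions_by_class.items():
        if len(latents) < 2:
            continue  # a pair needs at least two inversions
        samples = []
        for _ in range(per_class):
            i, j = rng.choice(len(latents), size=2, replace=False)
            t = rng.uniform(0.0, 1.0)  # assumed: ratio drawn uniformly
            samples.append(circle_interpolate(latents[i], latents[j], t))
        out[label] = samples
    return out
```

For two inputs of equal norm, slerp keeps the interpolated latent at that same norm, whereas linear interpolation shrinks it toward the origin. This is one plausible reason to prefer interpolation along an arc when the inputs are Gaussian-like noise latents, though the paper's own motivation should be checked against the full text.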