Diverse Generation while Maintaining Semantic Coordination: A Diffusion-Based Data Augmentation Method for Object Detection

Sen Nie, Zhuo Wang, Xinxin Wang, Kun He

2024-08-06

Abstract: Recent studies emphasize the crucial role of data augmentation in enhancing the performance of object detection models. However, existing methodologies often struggle to effectively harmonize dataset diversity with semantic coordination. To bridge this gap, we introduce an innovative augmentation technique leveraging pre-trained conditional diffusion models to mediate this balance. Our approach encompasses the development of a Category Affinity Matrix, meticulously designed to enhance dataset diversity, and a Surrounding Region Alignment strategy, which ensures the preservation of semantic coordination in the augmented images. Extensive experimental evaluations confirm the efficacy of our method in enriching dataset diversity while seamlessly maintaining semantic coordination. Our method yields substantial average improvements of +1.4AP, +0.9AP, and +3.4AP over existing alternatives on three distinct object detection models, respectively.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper addresses how to increase dataset diversity in object detection while preserving semantic coordination. Existing data augmentation methods struggle to balance the two. Methods that apply geometric transformations and similar minor modifications keep the image semantically consistent overall but add little diversity. Methods based on generative models add diversity but have difficulty keeping the generated image semantically coherent.

To overcome these limitations, the authors propose a diffusion-based data augmentation method built on two key techniques:

1. **Category Affinity Matrix**: The visual and semantic similarities between categories are computed and assembled into a matrix. When an original object is replaced, the matrix guides the generative model toward categories that have an affinity with the original one, which increases dataset diversity in a controlled way.
2. **Surrounding Region Alignment**: Information extracted from the original diffusion process is combined with the new diffusion process. This preserves the semantic integrity of the generated image and addresses the potential semantic disconnect between the generated object and its background.

The experimental results show average improvements of +1.4AP, +0.9AP, and +3.4AP on three different object detection models, respectively, which supports the claim that the method increases dataset diversity while maintaining semantic coordination. The method also performs well on specific categories and on fine-grained datasets, with improvements of +3.6AP and +4.4AP, respectively.
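The two techniques described in this summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. The summary does not say how visual and semantic similarities are combined or how the two diffusion processes are merged, so everything here is an assumption made for illustration: the weighted average of cosine similarities with weight `alpha`, the affinity-proportional sampling, the mask-based blend, and the function names `affinity_matrix`, `sample_replacement`, and `align_surroundings`, along with the toy embeddings.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def affinity_matrix(visual, semantic, alpha=0.5):
    """Build a category-to-category affinity table.

    visual / semantic: dict mapping category name -> embedding (hypothetical inputs).
    alpha: assumed mixing weight between visual and semantic similarity.
    """
    cats = sorted(visual)
    return {
        a: {
            b: alpha * cosine(visual[a], visual[b])
            + (1 - alpha) * cosine(semantic[a], semantic[b])
            for b in cats
            if b != a
        }
        for a in cats
    }


def sample_replacement(affinity, category, rng):
    """Pick a replacement category with probability proportional to affinity."""
    candidates = list(affinity[category])
    weights = [max(affinity[category][c], 0.0) for c in candidates]
    if sum(weights) == 0:
        return category  # no related category: keep the original
    return rng.choices(candidates, weights=weights, k=1)[0]


def align_surroundings(orig_latent, new_latent, mask):
    """Assumed blend: new latent inside the object mask (1), original outside (0)."""
    return [m * n + (1 - m) * o for o, n, m in zip(orig_latent, new_latent, mask)]


# Toy embeddings, invented for this example only.
visual = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "truck": [0.0, 1.0]}
semantic = {"cat": [1.0, 0.2], "dog": [0.8, 0.3], "truck": [0.1, 1.0]}

A = affinity_matrix(visual, semantic)
print(sample_replacement(A, "cat", random.Random(0)))
print(align_surroundings([1, 2, 3], [9, 9, 9], [0, 1, 0]))
```

In this toy setup, "dog" receives a much higher affinity to "cat" than "truck" does, so replacements tend to stay within plausible categories. The blend keeps the original values wherever the mask is 0, which is one simple way to keep the surroundings of a replaced object unchanged.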

Inference Fusion with Associative Semantics for Unseen Object Detection

Boosting Unsupervised Contrastive Learning Using Diffusion-Based Data Augmentation from Scratch

A Simple Background Augmentation Method for Object Detection with Diffusion Model

A Data Augmentation Method Based on Multi-Modal Image Fusion for Detection and Segmentation

Exploring Data Augmentation for Multi-Modality 3D Object Detection

AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model

Diff-Mosaic: Augmenting Realistic Representations in Infrared Small Target Detection via Diffusion Prior

Enhancing Monocular 3-D Object Detection Through Data Augmentation Strategies

DIAGen: Diverse Image Augmentation with Generative Models

Improving 3D Object Detection through Progressive Population Based Augmentation

Effective Data Augmentation With Diffusion Models

Semantic-Guided Generative Image Augmentation Method with Diffusion Models for Image Classification

Decoupled Data Augmentation for Improving Image Classification

AeroGen: Enhancing Remote Sensing Object Detection with Diffusion-Driven Data Generation

Enhanced Generative Data Augmentation for Semantic Segmentation via Stronger Guidance

Augmenting 3-D Object Detection Through Data Uncertainty-Driven Auxiliary Framework

DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception

Detail Reinforcement Diffusion Model: Augmentation Fine-Grained Visual Categorization in Few-Shot Conditions

A Good Data Augmentation Policy Is Not All You Need: A Multi-Task Learning Perspective