Abstract: Data augmentation (DA) has been widely used to improve the generalization of deep neural networks. While existing DA methods have proven effective, they often apply augmentation operations with random magnitudes to each sample. However, this approach can inadvertently introduce noise, induce distribution shifts, and increase the risk of overfitting. In this paper, we propose EntAugment, a tuning-free and adaptive DA framework. Unlike previous work, EntAugment dynamically assesses and adjusts the augmentation magnitude for each sample during training, leveraging insights into both the inherent complexity of training samples and the evolving status of deep models. Specifically, in EntAugment, the magnitudes are determined by the information entropy of the probability distribution obtained by applying the softmax function to the model's output. In addition, to further enhance the efficacy of EntAugment, we introduce a novel entropy regularization term, EntLoss, which complements the EntAugment approach. Theoretical analysis further demonstrates that EntLoss, compared to traditional cross-entropy loss, achieves closer alignment between the model distribution and the underlying dataset distribution. Moreover, EntAugment and EntLoss can be used separately or jointly. We conduct extensive experiments across multiple image classification tasks and network architectures, with thorough comparisons against existing DA methods. Importantly, the proposed methods outperform others without introducing any auxiliary models or noticeable extra computational costs, highlighting both effectiveness and efficiency. Code is available at https://github.com/Jackbrocp/EntAugment.
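
The entropy-driven magnitude described in the abstract can be illustrated with a minimal sketch. The abstract states only that the magnitude is derived from the entropy of the softmax output; the specific mapping used here (one minus the entropy normalized by its maximum, so that confident predictions receive stronger augmentation) is an assumption made for illustration, and the function names are hypothetical rather than taken from the authors' repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw model outputs."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy of probs divided by log(K), giving a value in [0, 1]."""
    num_classes = len(probs)
    if num_classes < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(num_classes)


def augmentation_magnitude(logits):
    """Map one sample's logits to an augmentation magnitude in [0, 1].

    ASSUMPTION: magnitude = 1 - normalized entropy. The abstract does not
    give the exact mapping; this choice is illustrative only.
    """
    magnitude = 1.0 - normalized_entropy(softmax(logits))
    # Clamp to guard against tiny floating-point overshoot.
    return min(1.0, max(0.0, magnitude))


if __name__ == "__main__":
    uncertain = augmentation_magnitude([0.0, 0.0, 0.0, 0.0])
    confident = augmentation_magnitude([20.0, 0.0, 0.0, 0.0])
    print(f"uncertain sample magnitude: {uncertain:.4f}")
    print(f"confident sample magnitude: {confident:.4f}")
```

Under this assumed mapping, a sample the model is unsure about (near-uniform softmax output) receives almost no augmentation, while a sample the model already classifies confidently receives a magnitude close to 1. Because the logits come from the model being trained, the magnitude would change as training progresses, with no auxiliary model required.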

Tied-Augment: Controlling Representation Similarity Improves Data Augmentation

Boosting Unsupervised Contrastive Learning Using Diffusion-Based Data Augmentation from Scratch

Data-Efficient Augmentation for Training Neural Networks

DualAug: Exploiting Additional Heavy Augmentation with OOD Data Rejection

KeepAugment: A Simple Information-Preserving Data Augmentation Approach

WeMix: How to Better Utilize Data Augmentation

AdaAugment: A Tuning-Free and Adaptive Approach to Enhance Data Augmentation

Effective Data Augmentation With Diffusion Models

Data Augmentation Revisited: Rethinking the Distribution Gap between Clean and Augmented Data

Augment the Pairs: Semantics-Preserving Image-Caption Pair Augmentation for Grounding-Based Vision and Language Models

EntAugment: Entropy-Driven Adaptive Data Augmentation Framework for Image Classification

RandMSAugment: A Mixed-Sample Augmentation for Limited-Data Scenarios

A Good Data Augmentation Policy Is Not All You Need: A Multi-Task Learning Perspective

Boosting Model Resilience via Implicit Adversarial Data Augmentation

Data Augmentation Can Improve Robustness

Augmentation Invariant Training

Revisiting Data Augmentation in Deep Reinforcement Learning

Adaptive Data Augmentation for Contrastive Learning

Safe Augmentation: Learning Task-Specific Transformations from Data

TeachAugment: Data Augmentation Optimization Using Teacher Knowledge