Abstract: Inspired by the idea of Positive-incentive Noise (Pi-Noise or $\pi$-Noise) that aims at learning the reliable noise beneficial to tasks, we scientifically investigate the connection between contrastive learning and $\pi$-noise in this paper. By converting the contrastive loss to an auxiliary Gaussian distribution to quantitatively measure the difficulty of the specific contrastive model under the information theory framework, we properly define the task entropy, the core concept of $\pi$-noise, of contrastive learning. It is further proved that the predefined data augmentation in the standard contrastive learning paradigm can be regarded as a kind of point estimation of $\pi$-noise. Inspired by the theoretical study, a framework that develops a $\pi$-noise generator to learn the beneficial noise (instead of estimation) as data augmentations for contrast is proposed. The designed framework can be applied to diverse types of data and is also completely compatible with the existing contrastive models. From the visualization, we surprisingly find that the proposed method successfully learns effective augmentations.
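
For orientation, the $\pi$-noise condition that the abstract builds on can be written as a short worked relation. This is a sketch under assumed notation that does not appear on this page ($T$ for the task, $\mathcal{E}$ for the noise, $H$ for entropy, $\mathrm{MI}$ for mutual information): noise counts as beneficial when observing it reduces the uncertainty of the task.

$$
\mathrm{MI}(T, \mathcal{E}) = H(T) - H(T \mid \mathcal{E}) > 0
\quad\Longleftrightarrow\quad
H(T) > H(T \mid \mathcal{E})
$$

Because the task entropy $H(T)$ does not depend on the noise, maximizing $\mathrm{MI}(T, \mathcal{E})$ over the noise distribution amounts to minimizing the conditional entropy $H(T \mid \mathcal{E})$. This is why a concrete definition of $H(T)$ for contrastive learning is needed before the noise can be learned.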

What problem does this paper attempt to address?

This paper attempts to address the issue of how to scientifically generate stable data augmentation views in Contrastive Learning. Specifically, existing contrastive learning models often rely on predefined data augmentation methods when handling non-visual data. These methods are usually considered hyperparameters and fail to learn which noises are beneficial. This greatly limits the application and performance improvement of contrastive learning on non-visual data. Therefore, this paper proposes a method based on the Positive-incentive Noise (π-noise) framework to automatically learn beneficial noise as data augmentation, thereby improving the performance of contrastive learning models.

### Main Contributions

1. **Definition of Task Entropy**: By designing an auxiliary Gaussian distribution related to the contrastive loss, the task entropy $H(T)$ of contrastive learning is defined, which is the core concept of the π-noise framework.
2. **Theoretical Analysis**: It is proven that predefined data augmentation in the standard contrastive learning framework can be regarded as a point estimate of π-noise.
3. **Proposing π-noise Generator**: A π-noise generator is proposed to automatically learn π-noise when training the representation module, instead of simply performing point estimation. The π-noise generated by this generator can be applied to any type of data, not limited to visual data.
4. **Experimental Validation**: Experiments were conducted on both visual and non-visual datasets to verify the effectiveness of applying π-noise to contrastive learning. Visualization results show that the proposed noise generator can successfully learn effective data augmentation methods.

### Method Overview

1. **Preliminary Work**: Introduces the basic formula of contrastive learning, including the InfoNCE loss function.
2. **π-noise Optimization Objective**: Based on the definition of π-noise, an objective to maximize the mutual information between the task and the noise is proposed.
3. **Definition of Task Entropy**: Task entropy is defined through contrastive loss, reflecting the difficulty of the contrastive learning task on a given dataset.
4. **Connection between π-noise and Contrastive Learning**: Through theoretical derivation, it is proven that the standard contrastive learning framework is equivalent to optimizing with point-estimated π-noise.
5. **π-noise Driven Data Augmentation**: An automatic learning method for π-noise, namely π-noise driven data augmentation (PiNDA), is proposed. Gradient backpropagation is achieved through Monte Carlo methods and reparameterization techniques.

### Experimental Results

- **Non-visual Datasets**: On the HAR and Reuters datasets, PiNDA significantly improved the performance of contrastive learning models, especially when using random Gaussian noise and adversarial examples as baselines.
- **Visual Datasets**: On the CIFAR-10, CIFAR-100, and STL-10 datasets, PiNDA also showed good performance improvement, and visualization results indicated that the generated π-noise could effectively enhance image data.

In summary, this paper addresses the issue of unstable data augmentation in contrastive learning by introducing the π-noise framework, providing new ideas and methods for the application of contrastive learning on non-visual data.
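
The ingredients named in the method overview (an InfoNCE loss, a noise generator, Monte Carlo sampling, and reparameterization) can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names (`info_nce`, `sample_noise`, `monte_carlo_loss`), the toy linear `encoder` and `generator`, the additive form `x + eps`, and the temperature of 0.5 are all assumptions made for illustration, and only the forward pass is shown (a real setup would use an autodiff framework to backpropagate through the sampled noise).

```python
import numpy as np


def info_nce(z1, z2, temperature=0.5):
    """InfoNCE loss: (z1[i], z2[i]) are positives, z2[j] with j != i are negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                       # (n, n) scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)    # shift for numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


def sample_noise(mu, log_sigma, rng):
    """Reparameterization: eps = mu + sigma * xi with xi ~ N(0, I).

    The randomness lives in xi, so eps is a deterministic function of
    (mu, log_sigma) that an autodiff framework could differentiate.
    """
    xi = rng.standard_normal(mu.shape)
    return mu + np.exp(log_sigma) * xi


def monte_carlo_loss(x, encoder, generator, rng, num_samples=4):
    """Monte Carlo estimate of the expected InfoNCE loss over generated noise."""
    mu, log_sigma = generator(x)          # hypothetical generator: per-sample Gaussian parameters
    z_clean = encoder(x)
    losses = []
    for _ in range(num_samples):
        eps = sample_noise(mu, log_sigma, rng)
        losses.append(info_nce(z_clean, encoder(x + eps)))  # noisy view as the positive
    return float(np.mean(losses))


# Toy setup (all weights are random placeholders, not learned parameters).
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))
W_enc = rng.standard_normal((16, 4))
W_mu = 0.01 * rng.standard_normal((16, 16))


def encoder(batch):
    return np.tanh(batch @ W_enc)


def generator(batch):
    return batch @ W_mu, np.full_like(batch, -2.0)


loss = monte_carlo_loss(x, encoder, generator, rng)
print(f"Monte Carlo InfoNCE estimate: {loss:.4f}")
```

Because the noise is applied directly to the input vector rather than through an image-specific transform, the same sketch works for tabular or text-feature inputs, which matches the summary's point that the learned noise is not limited to visual data.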

Data Augmentation of Contrastive Learning is Estimating Positive-incentive Noise

Optimal Positive Generation via Latent Transformation for Contrastive Learning

Boosting Unsupervised Contrastive Learning Using Diffusion-Based Data Augmentation from Scratch

Data Noising as Smoothing in Neural Network Language Models

Do Generated Data Always Help Contrastive Learning?

Full-Attention Driven Graph Contrastive Learning: with Effective Mutual Information Insight

MetAug: Contrastive Learning via Meta Feature Augmentation

Differentiable Data Augmentation for Contrastive Sentence Representation Learning

Towards Efficient Data-Centric Robust Machine Learning with Noise-based Augmentation

Chaos is a Ladder: A New Theoretical Understanding of Contrastive Learning Via Augmentation Overlap

DiffAug: Enhance Unsupervised Contrastive Learning with Domain-Knowledge-Free Diffusion-based Data Augmentation

Information-guided pixel augmentation for pixel-wise contrastive learning

Time Series Contrastive Learning with Information-Aware Augmentations

Interactive Augmentations, Features, and Parameters for Contrastive Learning

Rethinking the Augmentation Module in Contrastive Learning: Learning Hierarchical Augmentation Invariance with Expanded Views

Understanding Contrastive Learning via Gaussian Mixture Models

Improving Contrastive Learning by Visualizing Feature Transformation

Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss

Contrasting the landscape of contrastive and non-contrastive learning

Contrastive Data and Learning for Natural Language Processing

Feature Augmentation for Self-supervised Contrastive Learning: A Closer Look