Abstract: In contrastive representation learning, a data representation is trained so that it can classify image instances even when the images are altered by augmentations. However, depending on the dataset, some augmentations can damage the information in the images beyond recognition, and such augmentations can result in collapsed representations. We present a partial solution to this problem by formalizing a stochastic encoding process in which there exists a tug-of-war between the data corruption introduced by the augmentations and the information preserved by the encoder. We show that, with the infoMax objective based on this framework, we can learn a data-dependent distribution of augmentations to avoid the collapse of the representation.

What problem does this paper attempt to address?

This paper addresses a problem in Contrastive Representation Learning (CRL): some augmentations destroy so much image information that the learned representations collapse. The authors point out that which augmentations are harmful depends on the dataset, so a fixed set of augmentation operations may unintentionally remove the key features of an image. For example, on the MNIST dataset, a crop that does not include the digit produces a view that carries no usable information, which degrades what the model learns.

To address this, the authors propose a framework with a trainable augmentation channel that adjusts the probability distribution over augmentation operations. The augmentation is treated as part of a stochastic encoding process, which sets up a "tug-of-war" between the data corruption introduced by the augmentation and the information retained by the encoder. By maximizing the Mutual Information (MI) objective \(I(X;Z)\), where \(X\) is the input and \(Z\) is the representation of its augmented version, the authors show how to learn a data-dependent distribution of augmentations \(P(T|X)\) that avoids representation collapse. The framework also gives a new perspective on existing CRL methods such as SimCLR, which can be regarded as the special case in which \(P(T|X)\) is fixed to a uniform distribution. The abstract describes the result as a partial solution to augmentation-induced collapse; it also suggests a direction for future research on how to select and optimize augmentation operations.
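The difference between a fixed uniform \(P(T|X)\) and a data-dependent one can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the authors' implementation: the "image" is a 1-D list of pixel intensities, the only augmentation is a contiguous crop, and the function names (`uniform_channel`, `data_dependent_channel`, `empty_crop_mass`) are hypothetical. In particular, the data-dependent channel uses a hand-written score (the amount of ink a crop keeps) as a stand-in for scores that a trainable channel would learn from the MI objective.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def crops(image, width):
    """All contiguous crops of a 1-D toy image (list of pixel intensities)."""
    return [image[s:s + width] for s in range(len(image) - width + 1)]


def uniform_channel(image, width):
    """Fixed P(T|X): every crop position is equally likely."""
    n = len(crops(image, width))
    return [1.0 / n] * n


def data_dependent_channel(image, width):
    """Hypothetical P(T|X): softmax over the ink each crop keeps.

    The score is hand-written for illustration; a trainable channel
    would learn these scores instead.
    """
    return softmax([float(sum(c)) for c in crops(image, width)])


def empty_crop_mass(image, width, probs):
    """Total probability of sampling a crop that contains no ink."""
    return sum(p for c, p in zip(crops(image, width), probs) if sum(c) == 0)


# A toy "digit": two inked pixels in the middle of an otherwise blank row.
image = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
width = 3

p_uniform = uniform_channel(image, width)
p_dependent = data_dependent_channel(image, width)

print("empty-crop probability, uniform:       ", empty_crop_mass(image, width, p_uniform))
print("empty-crop probability, data-dependent:", empty_crop_mass(image, width, p_dependent))
```

In this toy setting, half of the crop positions miss the digit entirely, so the uniform channel samples an uninformative view half of the time, while the data-dependent channel shifts probability toward crops that keep the digit. This mirrors the MNIST cropping example above; it does not show how the distribution is actually trained.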

Contrastive Representation Learning with Trainable Augmentation Channel
