Abstract: Through minimization of an appropriate loss function such as the InfoNCE loss, contrastive learning (CL) learns a useful representation function by pulling positive samples close to each other while pushing negative samples far apart in the embedding space. The positive samples are typically created using "label-preserving" augmentations, i.e., domain-specific transformations of a given datum or anchor. In unsupervised CL (UCL), where class information is absent, the negative samples are typically chosen randomly and independently of the anchor from a preset negative sampling distribution over the entire dataset. This leads to class collisions in UCL, where a negative sample belongs to the same class as the anchor. Supervised CL (SCL) avoids class collisions by conditioning the negative sampling distribution on samples whose labels differ from that of the anchor. In hard-UCL (H-UCL), which has been shown to be an effective method to further enhance UCL, the negative sampling distribution is conditionally tilted, by means of a hardening function, towards samples that are closer to the anchor. Motivated by this, in this paper we propose hard-SCL (H-SCL), wherein the class-conditional negative sampling distribution is tilted via a hardening function. Our simulation results confirm the utility of H-SCL over SCL, with significant performance gains in downstream classification tasks. Analytically, we show that, in the limit of infinite negative samples per anchor and under a suitable assumption, the H-SCL loss is upper bounded by the H-UCL loss, thereby justifying the utility of H-UCL for controlling the H-SCL loss in the absence of label information. Through experiments on several datasets, we verify the assumption as well as the claimed inequality between the H-UCL and H-SCL losses. We also provide a plausible scenario where the H-SCL loss is lower bounded by the UCL loss, indicating the limited utility of UCL in controlling the H-SCL loss.
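The four negative-sampling schemes described above (UCL, SCL, H-UCL, H-SCL) can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the hardening function is assumed to be the exponential tilt exp(beta * similarity), the loss is an InfoNCE-style form log(1 + neg/pos) in which the expectation over negatives is replaced by a weighted average over the batch, and the names `hard_negative_weights`, `contrastive_loss`, `beta`, and `temperature` are hypothetical.

```python
import numpy as np

def hard_negative_weights(sim, neg_mask, beta):
    """Row-normalized weights over allowed negatives.

    Assumed hardening function: exp(beta * sim). beta = 0 gives uniform
    weights; beta > 0 tilts the weights towards negatives closer to the anchor.
    Each row of neg_mask must allow at least one negative.
    """
    logits = np.where(neg_mask, beta * sim, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)  # excluded entries become exactly 0
    return w / w.sum(axis=1, keepdims=True)

def contrastive_loss(z, z_pos, labels=None, beta=0.0, temperature=0.5):
    """InfoNCE-style loss with a weighted average over in-batch negatives.

    labels=None, beta=0 -> UCL-like;  labels=None, beta>0 -> H-UCL-like
    labels given, beta=0 -> SCL-like; labels given, beta>0 -> H-SCL-like
    """
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    z_pos = z_pos / np.linalg.norm(z_pos, axis=1, keepdims=True)
    n = z.shape[0]
    sim = z @ z.T  # cosine similarity between anchors and candidate negatives

    # Unsupervised: every other sample is a candidate negative.
    neg_mask = ~np.eye(n, dtype=bool)
    if labels is not None:
        # Supervised: keep only samples whose label differs from the anchor's.
        labels = np.asarray(labels)
        neg_mask &= labels[:, None] != labels[None, :]

    w = hard_negative_weights(sim, neg_mask, beta)
    pos = np.exp(np.sum(z * z_pos, axis=1) / temperature)
    neg = np.sum(w * np.exp(sim / temperature), axis=1)
    return float(np.mean(np.log1p(neg / pos)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(8, 4))                   # anchor embeddings
    z_pos = z + 0.1 * rng.normal(size=(8, 4))     # stand-in for augmented views
    labels = np.array([0, 0, 1, 1, 2, 2, 3, 3])
    for name, kwargs in [("UCL", {}), ("H-UCL", {"beta": 2.0}),
                         ("SCL", {"labels": labels}),
                         ("H-SCL", {"labels": labels, "beta": 2.0})]:
        print(name, contrastive_loss(z, z_pos, **kwargs))
```

In this sketch, passing `labels` removes same-class samples from the negative set (avoiding class collisions), and `beta` controls how strongly the remaining negatives are tilted towards the anchor. With a fixed negative set, increasing `beta` from zero cannot decrease the sketch's loss, since the tilt places more weight on negatives with higher similarity. The sketch does not by itself reproduce the inequality between the H-SCL and H-UCL losses, which the abstract states holds in the infinite-negative limit under an additional assumption.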
