Abstract: As a representative framework of self-supervised learning (SSL), contrastive learning (CL) has drawn enormous attention in representation learning. By pulling a “positive” example toward an anchor and pushing many “negative” examples away from it, CL is able to generate high-quality representations for data of different modalities. The quality of the selected positive and negative examples is therefore critical to the performance of CL-based models. However, because labels are assumed to be unavailable, most existing work follows the paradigm of contrastive instance discrimination, which treats each input instance as an individual category. As a result, existing methods focus on positive example generation and have designed many data augmentation strategies, while for negative examples they simply use in-batch negative sampling. We argue that this negative sampling strategy easily selects false negatives and inhibits the capability of CL, which we also believe is one of the reasons why a large number of negatives is needed in CL. Rather than relying on annotated labels, we tackle this problem in an unsupervised manner. We propose to integrate expectation maximization (EM) into the selection of negative examples and develop a novel EM-enhanced negative sampling strategy (EMCRL) that distinguishes false negatives from true ones to improve CL performance. Specifically, EMCRL employs EM to estimate the distribution of ground-truth relations between each sample and its in-batch negatives and then optimizes the model parameters with these estimates. Because the EM algorithm is sensitive to parameter initialization, we add a random flip to the distribution estimation to make the learning process more robust. Extensive experiments with several advanced models on sentence representation and image representation tasks demonstrate the effectiveness of EMCRL.
Our method is easy to implement, and the code is publicly available at https://github.com/zhangkunzk/EMCRL_pytorch.
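
To make the idea in the abstract concrete, the following is a minimal sketch of EM-style reweighting of in-batch negatives. It is not the authors' implementation. The abstract does not give the estimation formulas, so the two-Gaussian mixture over similarities, the parameters `mu_true`, `mu_false`, `std`, `prior_true`, and `flip_prob`, and every function name here are assumptions made for illustration only.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def e_step(sim, mu_true, mu_false, std, prior_true):
    """Posterior probability that each in-batch negative is a true negative.

    Assumption (not from the paper): similarities of true and false
    negatives follow two Gaussians with a shared standard deviation.
    """
    n = len(sim)
    post = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # the diagonal holds the positive pair
            p_true = prior_true * gaussian_pdf(sim[i][j], mu_true, std)
            p_false = (1.0 - prior_true) * gaussian_pdf(sim[i][j], mu_false, std)
            post[i][j] = p_true / (p_true + p_false + 1e-12)
    return post


def random_flip(post, flip_prob, rng):
    """With probability flip_prob, flip an off-diagonal estimate w to 1 - w."""
    n = len(post)
    return [
        [
            1.0 - post[i][j] if i != j and rng.random() < flip_prob else post[i][j]
            for j in range(n)
        ]
        for i in range(n)
    ]


def weighted_info_nce(sim, weights, temperature=0.05):
    """InfoNCE loss in which each negative term is scaled by its weight.

    In training, this loss would be minimized with respect to the encoder
    parameters (the M-step); here it is only evaluated.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        pos = math.exp(sim[i][i] / temperature)
        neg = sum(
            weights[i][j] * math.exp(sim[i][j] / temperature)
            for j in range(n)
            if j != i
        )
        total += -math.log(pos / (pos + neg))
    return total / n


# Toy cosine-similarity matrix: samples 0 and 1 are near-duplicates,
# so each is likely a false negative for the other.
sim = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
post = e_step(sim, mu_true=0.1, mu_false=0.9, std=0.2, prior_true=0.9)
weights = random_flip(post, flip_prob=0.1, rng=random.Random(0))
loss = weighted_info_nce(sim, weights)
```

In this toy batch the estimated true-negative probability for the pair (0, 1) is close to zero, so that pair contributes little to the denominator of the loss, whereas plain in-batch sampling would weight every negative equally. The flip occasionally inverts an estimate so that an early wrong estimate is not locked in.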

EMCRL: EM-Enhanced Negative Sampling Strategy for Contrastive Representation Learning

Debiased Graph Contrastive Learning

ProGCL: Rethinking Hard Negative Mining in Graph Contrastive Learning

EMCLR: Expectation Maximization Contrastive Learning Representations

Negative Sampling for Contrastive Representation Learning: A Review

Contrastive Attraction and Contrastive Repulsion for Representation Learning

Contrastive Learning with Negative Sampling Correction

EMC$^2$: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence

Debiased Contrastive Learning of Unsupervised Sentence Representations

Self-supervised Graph-level Representation Learning with Adversarial Contrastive Learning

Boosting Graph Contrastive Learning Via Adaptive Sampling

Select The Best: Enhancing Graph Representation with Adaptive Negative Sample Selection

Unsupervised Sentence Representation Via Contrastive Learning with Mixing Negatives

SSLCL: an Efficient Model-Agnostic Supervised Contrastive Learning Framework for Emotion Recognition in Conversations

Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning

Structural self-contrast learning based on adaptive weighted negative samples for facial expression recognition

ConCur: Self-supervised Graph Representation Based on Contrastive Learning with Curriculum Negative Sampling

Exploring Non-Contrastive Representation Learning for Deep Clustering

Contrastive Learning with Synthetic Positives

Self-Damaging Contrastive Learning

A Contrastive Framework to Enhance Unsupervised Sentence Representation Learning