Abstract: Unsupervised feature learning has made great strides with contrastive learning based on instance discrimination and invariant mapping, as benchmarked on curated class-balanced datasets. However, natural data can be highly correlated and long-tail distributed. Natural between-instance similarity conflicts with the presumed instance distinction, causing unstable training and poor performance. Our idea is to discover and integrate between-instance similarity into contrastive learning, not directly by instance grouping, but by cross-level discrimination (CLD) between instances and local instance groups. While invariant mapping of each instance is imposed by attraction within its augmented views, between-instance similarity can emerge from common repulsion against instance groups. Our batch-wise and cross-view comparisons also greatly improve the positive/negative sample ratio of contrastive learning and achieve better invariant mapping. To effect both grouping and discrimination objectives, we impose them on features separately derived from a shared representation. In addition, we propose normalized projection heads and unsupervised hyper-parameter tuning for the first time. Our extensive experimentation demonstrates that CLD is a lean and powerful add-on to existing methods such as NPID, MoCo, InfoMin, and BYOL on highly correlated, long-tail, or balanced datasets. It not only achieves new state-of-the-art results on self-supervision, semi-supervision, and transfer learning benchmarks, but also beats MoCo v2 and SimCLR on every reported performance measure, even though their results were attained with much larger compute. CLD effectively brings unsupervised learning closer to natural data and real-world applications. Our code is publicly available at: https://github.com/frank-xwang/CLD-UnsupervisedLearning
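
The cross-level comparison described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the repository linked above for that). It assumes that local instance groups are obtained by spherical k-means within a batch, that each instance in one augmented view is attracted to the group centroid its counterpart falls into in the other view, and that it is repelled from the remaining centroids. The function names, the number of groups `k`, and the `temperature` value are all hypothetical choices made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=1, eps=1e-12):
    """Project rows onto the unit sphere."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)

def spherical_kmeans(feats, k, iters=10, seed=0):
    """Cluster unit-norm features by cosine similarity (assumed grouping step)."""
    rng = np.random.default_rng(seed)
    centroids = feats[rng.choice(len(feats), size=k, replace=False)].copy()
    for _ in range(iters):
        assign = np.argmax(feats @ centroids.T, axis=1)
        for c in range(k):
            members = feats[assign == c]
            if len(members) > 0:  # keep the old centroid if a group is empty
                centroids[c] = members.mean(axis=0)
        centroids = l2_normalize(centroids)
    assign = np.argmax(feats @ centroids.T, axis=1)
    return centroids, assign

def cross_level_loss(inst_feats, centroids, assign, temperature=0.2):
    """Softmax cross-entropy of instances against group centroids.

    assign[i] is the group of instance i's counterpart in the other view,
    so that centroid acts as the positive and all others as negatives.
    """
    logits = inst_feats @ centroids.T / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(inst_feats)), assign].mean()

def cld_loss(view_a, view_b, k=4, temperature=0.2):
    """Symmetric cross-view, cross-level loss over one batch (illustrative)."""
    a, b = l2_normalize(view_a), l2_normalize(view_b)
    cent_a, assign_a = spherical_kmeans(a, k)
    cent_b, assign_b = spherical_kmeans(b, k)
    return 0.5 * (cross_level_loss(a, cent_b, assign_b, temperature)
                  + cross_level_loss(b, cent_a, assign_a, temperature))

# Usage: two noisy "views" of the same 32 random 16-dimensional features.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(32, 16))
view_b = view_a + 0.05 * rng.normal(size=(32, 16))
print(cld_loss(view_a, view_b, k=4))
```

In a full training setup this term would be added to an existing instance-level contrastive loss and computed on features from a separate projection branch, as the abstract indicates. The sketch omits both and only shows the instance-versus-group comparison.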

Space-correlated Contrastive Representation Learning with Multiple Instances

Point Contrastive Prediction with Semantic Clustering for Self-Supervised Learning on Point Cloud Videos

RegionCL: Exploring Contrastive Region Pairs for Self-supervised Representation Learning

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Saliency Guided Contrastive Learning on Scene Images

RegionCL: Can Simple Region Swapping Contribute to Contrastive Learning?

DenseCL: A Simple Framework for Self-Supervised Dense Visual Pre-Training

Spatial and Semantic Consistency Contrastive Learning for Self-Supervised Semantic Segmentation of Remote Sensing Images

Contrastive Attraction and Contrastive Repulsion for Representation Learning

Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination

HAPiCLR: heuristic attention pixel-level contrastive loss representation learning for self-supervised pretraining

Center-wise Local Image Mixture for Contrastive Representation Learning

Unified Contrastive Learning in Image-Text-Label Space

Multi-Level Contrastive Learning for Dense Prediction Task

Learning multi-view visual correspondences with self-supervision

Object-aware Contrastive Learning for Debiased Scene Representation

Contrastive Learning with Synthetic Positives

Hyperbolic Contrastive Learning for Visual Representations beyond Objects

Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning

Adaptive Multi-head Contrastive Learning

Region-aware Contrastive Learning for Semantic Segmentation