Abstract: Self-supervised learning (SSL) methods have achieved significant success by maximizing the mutual information between two augmented views, with cropping being a popular augmentation technique. Cropped regions are widely used to construct positive pairs, whereas the regions remaining after cropping have rarely been explored in existing methods, even though the two together constitute the same image instance and both contribute to the description of the category. In this paper, we make the first attempt to demonstrate the importance of both regions in cropping from a complete perspective, and the effectiveness of using both, by designing a simple yet effective pretext task called Region Contrastive Learning (RegionCL). Technically, to construct the two kinds of regions, we randomly crop a region of the same size (called the paste view) from each input image and swap these regions between different images, composing new images together with the remaining regions (called the canvas view). Then, instead of taking the new images as a whole as positive or negative samples, contrastive pairs are efficiently constructed from the regional perspective based on two simple criteria: each view is (1) positive with views augmented from the same original image and (2) negative with views augmented from other images. With minor modifications to popular SSL methods, RegionCL exploits these abundant pairs and helps the model distinguish the region features from both canvas and paste views, thereby learning better visual representations. Experiments on ImageNet, MS COCO, and Cityscapes demonstrate that RegionCL improves MoCo v2, DenseCL, and SimSiam by large margins and achieves state-of-the-art performance on classification, detection, and segmentation tasks. The code is publicly available at https://github.com/Annbless/RegionCL.
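The region swap and the two pairing criteria described in the abstract can be illustrated with a minimal NumPy sketch. This is not the released RegionCL code: the function names `region_swap` and `pair_masks` are hypothetical, and two simplifications are assumed for brevity, namely that every image in the batch shares one crop location and that each image receives its paste view from the previous image in the batch.

```python
import numpy as np


def region_swap(images, crop_size, rng):
    """Swap one square region between images of a batch shaped (B, H, W, C).

    Assumption (not from the paper): all images share the same crop box,
    and image i receives the paste view of image i - 1 (cyclically).
    """
    b, h, w, _ = images.shape
    if b < 2:
        raise ValueError("need at least two images to swap regions")
    top = int(rng.integers(0, h - crop_size + 1))
    left = int(rng.integers(0, w - crop_size + 1))
    # Cyclic shift: every image gets its paste view from a different image.
    perm = np.roll(np.arange(b), 1)
    rows = slice(top, top + crop_size)
    cols = slice(left, left + crop_size)
    composed = images.copy()
    # Inside the box: paste view from image perm[i]; outside: canvas view of i.
    composed[:, rows, cols] = images[perm, rows, cols]
    return composed, (top, left), perm


def pair_masks(perm):
    """Positive-pair masks between query images (rows) and composed images (columns).

    canvas_pos[q, k] is True when the canvas view of composed image k comes
    from original image q; paste_pos[q, k] is the same for the paste view.
    Every False entry is treated as a negative pair.
    """
    ids = np.arange(len(perm))
    canvas_pos = ids[:, None] == ids[None, :]
    paste_pos = ids[:, None] == perm[None, :]
    return canvas_pos, paste_pos


rng = np.random.default_rng(0)
batch = rng.random((4, 32, 32, 3))  # toy batch standing in for real images
composed, box, perm = region_swap(batch, crop_size=16, rng=rng)
canvas_pos, paste_pos = pair_masks(perm)
```

In a full pipeline the masks would select which region features count as positives in the contrastive loss of the underlying SSL method; how those features are pooled from the composed image is left out here.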

SemanticCrop: Boosting Contrastive Learning Via Semantic-Cropped Views

Crafting Better Contrastive Views for Siamese Representation Learning

Seed the Views: Hierarchical Semantic Alignment for Contrastive Representation Learning

RegionCL: Can Simple Region Swapping Contribute to Contrastive Learning?

ParamCrop: Parametric Cubic Cropping for Video Contrastive Learning

Cross-Image Pixel Contrasting for Semantic Segmentation

Enhancing Contrastive Learning Inspired by the Philosophy of "The Blind Men and the Elephant"

Hierarchical Semantic Aggregation for Contrastive Representation Learning

Semantic Image Cropping

Saliency Guided Contrastive Learning on Scene Images

RegionCL: Exploring Contrastive Region Pairs for Self-supervised Representation Learning

RepCo: Replenish Sample Views with Better Consistency for Contrastive Learning

Semantic-Enhanced Supervised Contrastive Learning

Enhancing Contrastive Learning with Efficient Combinatorial Positive Pairing

Exploring Cross-Image Pixel Contrast for Semantic Segmentation

Dense Semantic Contrast for Self-Supervised Visual Representation Learning

Region-aware Contrastive Learning for Semantic Segmentation

Saliency Aware Image Cropping with Latent Region Pair

A Semantic Segmentation Algorithm Based on Contrastive Learning Using Aligned Feature Samples

Semantic Positive Pairs for Enhancing Visual Representation Learning of Instance Discrimination Methods