Abstract: Contrastive learning allows general features for downstream tasks to be learned without labeled data by leveraging intrinsic signals within remote sensing images. Existing contrastive learning methods encourage invariant feature learning by pulling positive samples, defined by random transformations, closer together in feature space, so that transformed samples of the same image at different intensities are treated as equivalent. However, remote sensing images differ from natural images in that their top-down perspective results in arbitrarily oriented objects, and the images therefore contain rich in-plane rotation information. Enforcing invariance to rotation transformations can cause features to lose rotation information, which degrades angle prediction for differently rotated samples in downstream tasks. Therefore, we argue that contrastive learning should not focus only on strict invariance but should encourage features to be equivariant to rotation while remaining invariant to other transformations. To achieve this goal, we propose an invariant–equivariant covariant network (Co-ECL) based on collaborative and reverse mechanisms. The collaborative mechanism encourages rotation equivariance by predicting the rotation transformations applied to input images, and it combines the invariant and equivariant learning tasks to jointly supervise feature learning. The reverse mechanism introduces a reverse rotation module in the feature learning stage: in the invariant learning task, it applies to the features a reverse rotation of the same intensity as the rotation used in the data transformation stage, thereby ensuring that the invariant and equivariant tasks are realized independently. In experiments on three publicly available oriented object detection datasets of remote sensing images, our method consistently achieved the best performance. Additionally, experiments on multi-angle datasets demonstrated that our method is robust on rotation-related tasks.

Co-ECL: Covariant Network with Equivariant Contrastive Learning for Oriented Object Detection in Remote Sensing Images