Abstract: Unsupervised domain adaptation (UDA) is critical for remote sensing object detection in real applications, where the domain gap between the source and target domains causes significant performance degradation. UDA achieves cross-domain alignment by leveraging unlabeled target-domain data, thus avoiding expensive annotation. However, existing works mainly address convolutional neural network (CNN)-based object detectors, which rely on complex adversarial learning architectures and fail to accurately align features in remote sensing images with sparsely distributed objects and inevitable background noise. Compared to CNN-based methods, the detection transformer (DETR) greatly simplifies the object detection pipeline and shows great potential owing to its intrinsic ability to model global relations between any pair of pixels. On this basis, we propose the first strong DETR-based baseline, Remote Sensing Teacher, for UDA in remote sensing object detection. Specifically, Remote Sensing Teacher introduces a learnable frequency-enhanced feature alignment (LFA) module. Within this module, we first transform the features into frequency space to simplify the attention solver and effectively capture domain-specific information. The module then enhances the global feature representations of sparsely distributed objects using a lightweight attention mechanism. Finally, it incorporates learnable filters with a gated mechanism, enabling selective alignment of features in noisy backgrounds. In addition, Remote Sensing Teacher employs a self-adaptive pseudo-label assigner (SPA) that automatically adjusts the class-wise confidence threshold according to the model’s learning status, thereby enabling the generation of high-quality pseudo-labels in scenarios with a long-tailed distribution. Leveraging these pseudo-labels further mitigates the domain bias of the detector by establishing alignment at the label level. Extensive experimental results demonstrate the superior performance and generalization capabilities of the proposed Remote Sensing Teacher in multiple remote sensing adaptation scenarios. The code is released at https://github.com/h751410234/RemoteSensingTeacher.
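
The idea of a class-wise confidence threshold that follows the model's learning status can be illustrated with a minimal sketch. This is not the SPA implementation from the paper or its repository. The class name `AdaptiveThresholdAssigner`, the running-mean "learning status", the scaling rule, and all parameter values are assumptions chosen only to show the general mechanism: classes the teacher is still unsure about (often the rare ones in a long-tailed set) get a lower threshold, so their pseudo-labels are not filtered out entirely.

```python
from collections import defaultdict


class AdaptiveThresholdAssigner:
    """Hypothetical class-wise pseudo-label filter (illustration only)."""

    def __init__(self, num_classes, base_threshold=0.5,
                 min_threshold=0.2, momentum=0.9):
        self.base = base_threshold
        self.min = min_threshold
        self.momentum = momentum
        # Running mean confidence per class, used here as an assumed
        # proxy for the model's "learning status" on that class.
        self.status = [base_threshold] * num_classes

    def update(self, detections):
        """Update the per-class status from (class_id, score) pairs."""
        sums = defaultdict(float)
        counts = defaultdict(int)
        for cls, score in detections:
            sums[cls] += score
            counts[cls] += 1
        for cls, n in counts.items():
            batch_mean = sums[cls] / n
            self.status[cls] = (self.momentum * self.status[cls]
                                + (1.0 - self.momentum) * batch_mean)

    def thresholds(self):
        """Scale the base threshold by each class's relative status."""
        best = max(self.status)
        return [max(self.min, self.base * s / best) for s in self.status]

    def assign(self, detections):
        """Keep detections whose score reaches their class threshold."""
        th = self.thresholds()
        return [(cls, score) for cls, score in detections if score >= th[cls]]
```

In a teacher-student setup, `update` would be called on the teacher's target-domain predictions each iteration, and `assign` would select the detections passed to the student as pseudo-labels. The floor `min_threshold` is an assumed safeguard against admitting very low-confidence boxes for classes the model has barely learned.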

Multi-adversarial Faster-RCNN with Paradigm Teacher for Unrestricted Object Detection

Multi-View Domain Adaptive Object Detection on Camera Networks

Multi-Adversarial Faster-RCNN for Unrestricted Object Detection

Joint Feature-Level And Pixel-Level Domain Adaption For Object Detection In The Wild

Domain Adaptive Object Detection via Asymmetric Tri-way Faster-RCNN

Partial Alignment for Object Detection in the Wild

Masked Retraining Teacher-Student Framework for Domain Adaptive Object Detection

Cross-Domain Adaptive Teacher for Object Detection

Multi-Source Domain Adaptation for Object Detection with Prototype-based Mean-teacher

Contrastive Mean Teacher for Domain Adaptive Object Detectors

Cross Domain Object Detection via Multi-Granularity Confidence Alignment based Mean Teacher

CMT: Co-training Mean-Teacher for Unsupervised Domain Adaptation on 3D Object Detection

Source-free domain adaptive object detection based on pseudo-supervised mean teacher

Remote Sensing Teacher: Cross-Domain Detection Transformer with Learnable Frequency-Enhanced Feature Alignment in Remote Sensing Imagery

Exploring Object Relation in Mean Teacher for Cross-Domain Detection

Cross-Domain Object Detection through Consistent and Contrastive Teacher with Fourier Transform

Reliable hybrid knowledge distillation for multi-source domain adaptive object detection

Versatile Teacher: A class-aware teacher–student framework for cross-domain adaptation

Deeply Aligned Adaptation for Cross-domain Object Detection

Style-Guided Adversarial Teacher for Cross-Domain Object Detection
