Cross Teaching Between Single-Spectral and Multi-Spectral Detection Transformers for Remote Sensing Object Detection

Jiahe Zhu, Kaiyue Zhou, Huan Zhang, Shengjin Wang, Hongbing Ma
DOI: https://doi.org/10.1109/igarss53475.2024.10641179
IF: 4.715
2024-01-01
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Abstract: Remote sensing platforms are often equipped with sensors of multiple spectrums to capture the diverse reflective properties of ground areas, typically including the visible spectrum and the near infrared (NIR) spectrum. Moreover, thermal infrared (TIR) sensors capture the radiated heat of targets and are capable of all-day observation regardless of illumination conditions. By leveraging the complementary features of different spectrums, multi-spectral fusion techniques enhance the precision and robustness of remote sensing object detection methods. In this article, we present an object detection method for remote sensing imagery named Multi-spectral Detection Transformer (Multi-spectral DETR). The model fuses multi-spectral features with deformable attention and utilizes fused features for object detection. The Multi-spectral Deformable Attention (MDA) fusion block integrates the flexibility of dynamic weights with the principle of fusion based on local regions. Then, we propose a simple yet effective oriented object detection scheme based on angle prediction. Finally, we introduce a novel cross-teaching method between single-spectral and multi-spectral models, which alleviates the spectral interference issue caused by inconsistent target visibility. Experimental results demonstrate that Multi-spectral DETR achieves state-of-the-art results on both the RGB-NIR VEDAI and the RGB-TIR DroneVehicle datasets.
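The abstract mentions oriented object detection based on angle prediction but does not give the box parameterization. As a minimal sketch only, assuming a common (cx, cy, w, h, theta) encoding with theta in radians measured counter-clockwise, the snippet below shows how such a prediction can be turned into four corner points. The function name `obb_to_corners` and the angle convention are illustrative assumptions, not the paper's definitions.

```python
import math


def obb_to_corners(cx, cy, w, h, theta):
    """Convert an oriented box to its four corner points.

    Assumed (hypothetical) encoding: center (cx, cy), size (w, h),
    rotation theta in radians, counter-clockwise about the center.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Corner offsets of the axis-aligned box, relative to its center.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each offset by theta, then translate to the box center.
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in offsets
    ]


# Usage: a 4 x 2 box centered at the origin, rotated by 90 degrees.
corners = obb_to_corners(0.0, 0.0, 4.0, 2.0, math.pi / 2)
```

With theta = 0 the result reduces to the ordinary axis-aligned box, so under this assumed encoding a horizontal detector is the special case in which the angle output is fixed at zero.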