An Improved DETR Based on Angle Denoising and Oriented Boxes Refinement for Remote Sensing Object Detection

Hongmei Wang, Chenkai Li, Qiaorong Wu, Jingyu Wang
DOI: https://doi.org/10.3390/rs16234420
IF: 5
2024-11-29
Remote Sensing
Abstract: Object detection in remote sensing images is challenging because the rotation angles of oriented ground objects are difficult to predict accurately and because insufficient object information causes false and missed detections. Moreover, traditional convolutional neural networks are inherently limited in their capacity to capture global contextual information. To address these challenges, a DETR-based remote sensing object detection model is designed for oriented objects. In addition to the backbone and the transformer encoders and decoders, the network includes scenario query guiding modules, oriented boxes refinement modules, auxiliary multiple detectors, and oriented boxes denoising modules. The scenario query guiding module implicitly guides the decoder to focus more on the object classification information specific to the scene during inference. The multiple deformable attention mechanism is extended to an oriented version and used in the oriented boxes refinement module, which repeatedly corrects the oriented boxes and thereby improves the network's ability to predict them precisely. The improved auxiliary multiple detectors and the oriented boxes denoising module are applied only during training, to enhance the ability of the encoder and decoder to learn oriented objects. Ablation experiments demonstrate the effectiveness of the designed modules. The detection accuracy of the network on DOTAv1.0 (76.77%) and HRSC2016 (97.01%) improves on state-of-the-art methods and is significantly higher than that of other DETR-based detection algorithms.
environmental sciences; imaging science & photographic technology; remote sensing; geosciences, multidisciplinary