Abstract: Recently, numerous methods have achieved impressive performance in remote sensing object detection, relying on convolution or transformer architectures. Such detectors typically have a feature backbone to extract useful features from raw input images. For the remote sensing domain, a common practice among current detectors is to initialize the backbone with weights pre-trained on ImageNet, which consists of natural scenes. Fine-tuning the backbone is then typically required to generate features suitable for remote sensing images. However, such fine-tuning could hinder the extraction of basic visual features in long-term training, thus restricting performance improvement. To mitigate this issue, we propose a novel method named DBF (Dynamic Backbone Freezing) for feature backbone fine-tuning on remote sensing object detection. Our method aims to handle the dilemma of whether the backbone should extract low-level generic features or possess specific knowledge of the remote sensing domain, by introducing a module called 'Freezing Scheduler' to dynamically manage the update of backbone features during training. Extensive experiments on DOTA and DIOR-R show that our approach enables more accurate model learning while substantially reducing computational costs. Our method can be seamlessly adopted without additional effort due to its straightforward design.

What problem does this paper attempt to address?

The paper proposes a new method for fine-tuning the feature backbone in remote sensing object detection. In this task, a pre-trained model is typically used as the backbone network for feature extraction, and the backbone is then fine-tuned to adapt to a specific remote sensing dataset. However, this fine-tuning may hinder the model's ability to extract fundamental visual features from the original images, thereby limiting performance improvement.

The proposed method, named DBF (Dynamic Backbone Freezing), aims to solve this problem by dynamically controlling the update of the feature backbone. Specifically, DBF introduces a module called the "Freezing Scheduler," which alternates between freezing and unfreezing the feature backbone according to a certain strategy during training. This allows the model to retain the general features learned from natural images while acquiring specific knowledge for the remote sensing domain.

The benefits of this method include:

1. **Improved model generalization**: By retaining some pre-trained weights, the model can better utilize the general features from natural images.
2. **Reduced computational cost**: By freezing part or all of the backbone network's parameters, memory usage and training time can be significantly reduced.
3. **Enhanced model prediction accuracy**: Experimental results show that this method can improve model accuracy while reducing computational complexity.

The authors conducted extensive experiments on two benchmark datasets, DOTA and DIOR-R, to validate the effectiveness of DBF. The results indicate that, compared to traditional methods of fully fine-tuning or fully freezing the feature backbone, DBF not only improves the model's prediction accuracy but also significantly reduces the demand for computational resources.

In summary, the paper addresses how to balance the generality and domain specificity of the feature backbone in remote sensing object detection in order to improve model performance and efficiency.
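
The alternation between freezing and unfreezing described above can be illustrated with a minimal sketch. The text here does not specify the actual schedule used by DBF, so the periodic rule below (update the backbone once every `period` epochs) is an assumption made purely for illustration, and the `Param` and `FreezingScheduler` classes are hypothetical stand-ins rather than the authors' implementation. In a framework such as PyTorch, the same idea would typically be expressed by toggling the `requires_grad` attribute of the backbone's parameters.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Param:
    """Hypothetical stand-in for a trainable tensor (e.g. a framework parameter)."""
    name: str
    requires_grad: bool = True


@dataclass
class FreezingScheduler:
    """Assumed periodic schedule: the backbone is updated once every `period` epochs."""
    period: int = 2

    def is_frozen(self, epoch: int) -> bool:
        # Epochs 0, period, 2*period, ... update the backbone; all others freeze it.
        return epoch % self.period != 0

    def apply(self, backbone_params: Iterable[Param], epoch: int) -> bool:
        # Toggle gradient tracking on every backbone parameter for this epoch.
        frozen = self.is_frozen(epoch)
        for p in backbone_params:
            p.requires_grad = not frozen
        return frozen


backbone = [Param("backbone.conv1"), Param("backbone.layer1")]
head = [Param("head.cls"), Param("head.reg")]  # detection head is always trained

scheduler = FreezingScheduler(period=2)
history: List[bool] = []
for epoch in range(6):
    history.append(scheduler.apply(backbone, epoch))
    # One epoch of training would run here; only parameters with
    # requires_grad=True would receive gradient updates.
```

With `period=2`, the backbone is frozen on every other epoch while the detection head stays trainable throughout. During frozen epochs no backbone gradients need to be computed or stored, which is the kind of saving the summary attributes to the method.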

Rethinking Feature Backbone Fine-tuning for Remote Sensing Object Detection

Deep Convolutional Feature Enhancement for Remote Sensing Object Detection

Remote Sensing Object Detection Based on Receptive Field Expansion Block

Oriented Object Detection Based on Foreground Feature Enhancement in Remote Sensing Images

Frequency Spectrum Features Modeling for Real-Time Tiny Object Detection in Remote Sensing Image

Object Detection in Aerial Remote Sensing Images with Multi-scale Feature Enhancement

Learning Critical Features for Arbitrary-Oriented Object Detection in Remote-Sensing Optical Images

Training Domain-invariant Object Detector Faster with Feature Replay and Slow Learner

Object Detection for Remote Sensing Based on the Enhanced YOLOv8 With WBiFPN

Fine-Grained Feature Enhancement for Object Detection in Remote Sensing Images

CoF-Net: A Progressive Coarse-to-Fine Framework for Object Detection in Remote-Sensing Imagery

A Self-Supplementary and Revised Network for Remote Sensing Object Detection

An Effective and Lightweight Hybrid Network for Object Detection in Remote Sensing Images

MutDet: Mutually Optimizing Pre-training for Remote Sensing Object Detection

Retentive Compensation and Personality Filtering for Few-Shot Remote Sensing Object Detection

Focus on Complex Background and Multi-Scale Remote Sensing Images Objects Detection

Adaptive Knowledge Distillation for Lightweight Remote Sensing Object Detectors Optimizing

Lightweight Remote Sensing Object Detection Algorithm for Spaceborne Edge Computation

Frozen-DETR: Enhancing DETR with Image Understanding from Frozen Foundation Models

An Improved DETR Based on Angle Denoising and Oriented Boxes Refinement for Remote Sensing Object Detection

Single-Stage Detector With Dual Feature Alignment for Remote Sensing Object Detection