Rethinking Feature Backbone Fine-tuning for Remote Sensing Object Detection

Yechan Kim, JongHyun Park, SooYeon Kim, Moongu Jeon
2024-08-08
Abstract: Recently, numerous methods have achieved impressive performance in remote sensing object detection, relying on convolution or transformer architectures. Such detectors typically have a feature backbone to extract useful features from raw input images. For the remote sensing domain, a common practice among current detectors is to initialize the backbone with pre-training on ImageNet consisting of natural scenes. Fine-tuning the backbone is then typically required to generate features suitable for remote-sensing images. However, this could hinder the extraction of basic visual features in long-term training, thus restricting performance improvement. To mitigate this issue, we propose a novel method named DBF (Dynamic Backbone Freezing) for feature backbone fine-tuning on remote sensing object detection. Our method aims to handle the dilemma of whether the backbone should extract low-level generic features or possess specific knowledge of the remote sensing domain, by introducing a module called 'Freezing Scheduler' to dynamically manage the update of backbone features during training. Extensive experiments on DOTA and DIOR-R show that our approach enables more accurate model learning while substantially reducing computational costs. Our method can be seamlessly adopted without additional effort due to its straightforward design.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses how the feature backbone should be fine-tuned in remote sensing object detection. Detectors in this domain typically use a pre-trained model as the backbone network for feature extraction and then fine-tune it on a specific remote sensing dataset. Over long training, however, this fine-tuning can hinder the backbone's ability to extract basic visual features from the raw input images, which limits further performance improvement.

The proposed method, DBF (Dynamic Backbone Freezing), tackles this problem by dynamically controlling when the feature backbone is updated. Specifically, DBF introduces a module called the "Freezing Scheduler," which alternates between freezing and unfreezing the feature backbone during training according to a schedule. This allows the model to retain the general features learned from natural images while acquiring knowledge specific to the remote sensing domain.

The benefits of this method include:

1. **Improved model generalization**: By retaining some pre-trained weights, the model can better utilize the general features from natural images.
2. **Reduced computational cost**: By freezing part or all of the backbone network's parameters, memory usage and training time can be significantly reduced.
3. **Enhanced model prediction accuracy**: Experimental results show that this method can improve model accuracy while reducing computational complexity.

The authors conducted extensive experiments on two benchmark datasets, DOTA and DIOR-R, to validate the effectiveness of DBF. The results indicate that, compared with the conventional choices of fully fine-tuning or fully freezing the feature backbone, DBF both improves the model's prediction accuracy and significantly reduces the demand for computational resources.
In summary, the paper addresses how to balance the generality and the domain specificity of the feature backbone in remote sensing object detection, with the goal of improving both model performance and efficiency.
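
The alternation between freezing and unfreezing described above can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not state the actual schedule, so the rule used here (the backbone is trainable once every `period` epochs and frozen otherwise) is an assumption, and the names `FreezingScheduler`, `period`, and `FakeParam` are hypothetical. The sketch only relies on each parameter exposing a boolean `requires_grad` attribute, as PyTorch parameters do.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


class HasRequiresGrad(Protocol):
    """Anything with a boolean `requires_grad` flag (e.g. a PyTorch parameter)."""
    requires_grad: bool


@dataclass
class FreezingScheduler:
    """Hypothetical scheduler: the backbone is trainable once every `period` epochs.

    The periodic rule is an assumption for illustration, not the paper's schedule.
    """
    period: int = 2  # assumed hyperparameter

    def __post_init__(self) -> None:
        if self.period < 1:
            raise ValueError("period must be >= 1")

    def is_frozen(self, epoch: int) -> bool:
        # Epochs 0, period, 2*period, ... update the backbone; all others freeze it.
        return epoch % self.period != 0

    def apply(self, backbone_params: Iterable[HasRequiresGrad], epoch: int) -> bool:
        """Set `requires_grad` on every backbone parameter for this epoch."""
        frozen = self.is_frozen(epoch)
        for param in backbone_params:
            param.requires_grad = not frozen
        return frozen


@dataclass
class FakeParam:
    """Stand-in for a framework parameter so the sketch runs without dependencies."""
    requires_grad: bool = True


if __name__ == "__main__":
    params = [FakeParam(), FakeParam()]
    scheduler = FreezingScheduler(period=3)
    for epoch in range(6):
        frozen = scheduler.apply(params, epoch)
        print(epoch, "frozen" if frozen else "trainable")
```

Under this assumed rule, `period=1` reduces to ordinary full fine-tuning, while larger values keep the backbone frozen for most epochs. In a PyTorch training loop one would call `scheduler.apply(...)` on the backbone's parameters at the start of each epoch; during frozen epochs no gradients are computed for those parameters, which is the kind of saving in memory and training time the summary refers to.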