SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection

Yuxuan Li, Xiang Li, Weijie Li, Qibin Hou, Li Liu, Ming-Ming Cheng, Jian Yang
DOI: https://doi.org/10.48550/arXiv.2403.06534
2024-03-11
Computer Vision and Pattern Recognition
Abstract: Synthetic Aperture Radar (SAR) object detection has gained significant attention recently due to its irreplaceable all-weather imaging capabilities. However, this research field suffers from both limited public datasets (mostly comprising <2K images with only mono-category objects) and inaccessible source code. To tackle these challenges, we establish a new benchmark dataset and an open-source method for large-scale SAR object detection. Our dataset, SARDet-100K, is the result of intensive surveying, collecting, and standardizing of 10 existing SAR detection datasets, providing a large-scale and diverse dataset for research purposes. To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created. With this high-quality dataset, we conducted comprehensive experiments and uncovered a crucial challenge in SAR object detection: the substantial disparities between pretraining on RGB datasets and finetuning on SAR datasets in terms of both data domain and model structure. To bridge these gaps, we propose a novel Multi-Stage with Filter Augmentation (MSFA) pretraining framework that tackles the problems from the perspective of data input, domain transition, and model migration. The proposed MSFA method significantly enhances the performance of SAR object detection models while demonstrating exceptional generalizability and flexibility across diverse models. This work aims to pave the way for further advancements in SAR object detection. The dataset and code are available at https://github.com/zcablii/SARDet_100K.
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper addresses two core issues in Synthetic Aperture Radar (SAR) object detection:

1. **Limitations of Datasets and Code**: Most publicly available SAR object detection datasets are small and narrow in scope (mostly fewer than 2K images covering a single object category), which restricts researchers' ability to evaluate different methods. In addition, the source code of existing methods is often not accessible, making it difficult to reproduce and compare research results.
2. **Gap Between Pretraining and Finetuning**: Models pretrained on natural RGB datasets (such as ImageNet) face substantial gaps in both data domain and model structure when they are finetuned on SAR datasets. The paper proposes a Multi-Stage with Filter Augmentation (MSFA) pretraining framework that bridges these gaps from the perspective of data input, domain transition, and model migration, thereby improving the performance of SAR object detection models.

By building the large-scale, diverse benchmark dataset SARDet-100K and proposing the MSFA pretraining framework, this work aims to advance SAR object detection and lay a foundation for subsequent research.
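
The text above says MSFA addresses the data-input gap through filter augmentation, but it does not say which filter is used or how the filtered output enters the model. As a rough illustration only, the sketch below assumes one plausible reading: a handcrafted filter is applied to a grayscale image and the result is stacked with the raw image as an extra input channel. The Sobel gradient magnitude and the names `sobel_magnitude` and `filter_augment` are assumptions chosen for this example, not the authors' implementation.

```python
import numpy as np


def sobel_magnitude(img: np.ndarray) -> np.ndarray:
    """Gradient magnitude of a 2-D grayscale image using 3x3 Sobel kernels.

    Assumption: Sobel is used here purely as an example of a handcrafted
    filter; the source text does not name a specific filter.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
    ky = kx.T
    h, w = img.shape
    # Replicate border pixels so the output keeps the input size.
    padded = np.pad(img.astype(np.float32), 1, mode="edge")
    gx = np.zeros((h, w), dtype=np.float32)
    gy = np.zeros((h, w), dtype=np.float32)
    # Accumulate the 3x3 sliding-window response one kernel tap at a time.
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.sqrt(gx ** 2 + gy ** 2)


def filter_augment(img: np.ndarray) -> np.ndarray:
    """Stack the raw image and its filtered version into a (2, H, W) array.

    Hypothetical helper: shows one way filtered features could be fed to a
    detector alongside the original pixels.
    """
    grad = sobel_magnitude(img)
    return np.stack([img.astype(np.float32), grad], axis=0)


if __name__ == "__main__":
    # Synthetic image with a vertical step edge between columns 3 and 4.
    demo = np.zeros((8, 8), dtype=np.float32)
    demo[:, 4:] = 1.0
    augmented = filter_augment(demo)
    print(augmented.shape)
```

The same two-channel representation can be computed for both RGB-derived grayscale images and SAR images, which is the intuition behind narrowing the input gap under this assumed reading.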