SAR-CDSS: A Semi-Supervised Cross-Domain Object Detection from Optical to SAR Domain

Cheng Luo, Yueting Zhang, Jiayi Guo, Yuxin Hu, Guangyao Zhou, Hongjian You, Xia Ning
DOI: https://doi.org/10.3390/rs16060940
IF: 5
2024-03-08
Remote Sensing
Abstract: The unique imaging modality of synthetic aperture radar (SAR) poses significant challenges for object detection, because SAR images are more complex to acquire and interpret than optical images. Recently, numerous studies have proposed cross-domain adaptive methods based on convolutional neural networks (CNNs) to promote SAR object detection using optical data. However, existing cross-domain methods focus on image features, do little to improve the input data, and ignore the valuable supervision provided by a few labeled SAR images. Therefore, we propose a semi-supervised cross-domain object detection framework that uses optical data and a small amount of SAR data to achieve knowledge transfer for SAR object detection. Our method focuses on data processing to gradually reduce the domain shift at the image, instance, and feature levels. First, we propose a data augmentation method of image mixing and instance swapping to generate a mixed domain that is more similar to the SAR domain. This method makes full use of the few available SAR annotations to reduce domain shift at the image and instance levels. Second, at the feature level, we propose an adaptive optimization strategy that filters out mixed-domain samples that deviate significantly from the SAR feature distribution when training the feature extractor. In addition, we employ a Vision Transformer (ViT) as the feature extractor to handle the global feature extraction of mixed images. We also propose a detection head based on the normalized Wasserstein distance (NWD) to enhance the detection of objects with smaller effective regions in SAR images. The effectiveness of the proposed method is evaluated on public SAR ship and oil tank datasets.
environmental sciences; imaging science & photographic technology; remote sensing; geosciences, multidisciplinary
What problem does this paper attempt to address?
This paper addresses how to use a small amount of labeled synthetic aperture radar (SAR) data together with a large amount of labeled optical data to transfer knowledge from optical images to SAR images, and thereby improve object detection performance on SAR images. Specifically, the paper identifies two main problems in current cross-domain methods:

1. **Ignoring the Supervisory Role of a Small Amount of Labeled SAR Images on the Network**: In practical applications, the target domain (SAR images) usually has only a small amount of labeled data. These data are very important for model training, but existing methods often overlook them.
2. **Over-focusing on Feature Adversarial Alignment while Ignoring Data Fusion Alignment between the Two Domains**: Existing cross-domain methods mainly focus on feature-level alignment and ignore image-level and instance-level alignment.

To address these problems, the authors propose a semi-supervised cross-domain object detection framework (SAR-CDSS). The framework optimizes the detector through the following three main steps to transfer knowledge from the fully labeled optical domain to the SAR domain, which has only a small number of labeled samples:

1. **Data Augmentation: Domain Mix**:
   - **Image-level Augmentation**: Randomly mix images from the source domain (optical images) and the target domain (SAR images) to generate mixed images that are closer to the distribution of the SAR domain.
   - **Instance-level Augmentation**: Separate the limited annotated instances from their backgrounds and randomly place them in other images, making full use of the limited instance annotations and aligning the instance-level features of the two domains.
2. **Global Feature Extractor: Vision Transformer (ViT)**:
   - Use ViT as the feature extractor so that its global receptive field can better extract the features of the mixed images.
3. **Adaptive Optimization Strategy**:
   - Use metric learning to identify and select the feature samples that are most conducive to knowledge transfer, instead of blindly increasing the amount of data. This improves the quality of the selected samples and also helps prevent over-fitting.

In addition, small targets in SAR images have a small effective energy region and carry little detailed information. To address this, the authors propose using the normalized Wasserstein distance (NWD) instead of the traditional Intersection over Union (IoU) metric to improve detection accuracy in complex scenarios.

In summary, the main contribution of this paper is a new semi-supervised cross-domain object detection framework that improves object detection performance in SAR images through multi-level data processing and feature extraction.
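The Domain Mix step described above can be illustrated with a minimal sketch. The summary does not give the exact mixing rule, so this assumes a CutMix-style patch replacement for the image level and a plain rectangular copy for the instance level (separating an instance from its background would need a mask, which is omitted here). The function names `mix_images` and `swap_instance` are hypothetical, and images are plain nested lists to keep the example dependency-free.

```python
import random


def mix_images(optical, sar, patch_hw, rng):
    """Return a copy of `optical` with one random patch replaced by the
    co-located patch of `sar`. Assumes both images have the same shape."""
    height, width = len(optical), len(optical[0])
    patch_h, patch_w = patch_hw
    top = rng.randrange(height - patch_h + 1)
    left = rng.randrange(width - patch_w + 1)
    mixed = [row[:] for row in optical]  # copy so the input stays untouched
    for r in range(top, top + patch_h):
        mixed[r][left:left + patch_w] = sar[r][left:left + patch_w]
    return mixed


def swap_instance(src, box, dst, new_xy):
    """Paste the pixels inside `box` = (x1, y1, x2, y2) of `src` (x2, y2
    exclusive) into a copy of `dst`, top-left corner at `new_xy`.
    Returns the new image and the instance's new box."""
    x1, y1, x2, y2 = box
    box_w, box_h = x2 - x1, y2 - y1
    new_x, new_y = new_xy
    if new_x + box_w > len(dst[0]) or new_y + box_h > len(dst):
        raise ValueError("instance does not fit at the requested position")
    out = [row[:] for row in dst]
    for dy in range(box_h):
        out[new_y + dy][new_x:new_x + box_w] = src[y1 + dy][x1:x2]
    return out, (new_x, new_y, new_x + box_w, new_y + box_h)


# Toy 4x4 single-channel images: optical is all 0, SAR is all 1.
optical = [[0] * 4 for _ in range(4)]
sar = [[1] * 4 for _ in range(4)]

mixed = mix_images(optical, sar, patch_hw=(2, 2), rng=random.Random(0))
pasted, new_box = swap_instance(sar, (0, 0, 2, 1), optical, (1, 2))
```

In the toy example, `mixed` is the optical image with a 2x2 block of SAR pixels, and `pasted` carries a 2x1 SAR "instance" whose updated box is returned so that the annotation moves with the pixels.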
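To make the contrast between NWD and IoU concrete, here is a minimal sketch of the commonly used NWD formulation, in which a box (cx, cy, w, h) is modeled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4). The summary does not describe how the paper's detection head applies NWD, so this only shows the metric itself. The normalizing constant `c` is dataset-dependent, and the default used here is a placeholder assumption.

```python
import math


def nwd(box_a, box_b, c=10.0):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes.
    `c` is a dataset-dependent constant; 10.0 is only a placeholder."""
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two box Gaussians.
    w2_sq = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_sq) / c)


def iou(box_a, box_b):
    """Intersection over Union for two (cx, cy, w, h) boxes."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


# Two small 4x4 boxes whose centers are 6 pixels apart: they do not overlap.
small_a = (10.0, 10.0, 4.0, 4.0)
small_b = (16.0, 10.0, 4.0, 4.0)

iou_value = iou(small_a, small_b)
nwd_value = nwd(small_a, small_b)
```

For these two non-overlapping small boxes, IoU is zero and gives no indication of how close the boxes are, while NWD still decreases smoothly with the center offset. This is the property that makes it attractive for small targets.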