Multitask Learning for SAR Ship Detection with Gaussian-Mask Joint Segmentation

Ming Zhao, Xin Zhang, André Kaup
DOI: https://doi.org/10.1109/TGRS.2023.3304847
2024-11-21
Abstract: Detecting ships in synthetic aperture radar (SAR) images is challenging due to strong speckle noise, complex surroundings, and varying scales. This paper proposes MLDet, a multitask learning framework for SAR ship detection, consisting of object detection, speckle suppression, and target segmentation tasks. An angle classification loss with aspect ratio weighting is introduced to improve detection accuracy by addressing angular periodicity and object proportions. The speckle suppression task uses a dual-feature fusion attention mechanism to reduce noise and fuse shallow and denoising features, enhancing robustness. The target segmentation task, leveraging a rotated Gaussian-mask, aids the network in extracting target regions from cluttered backgrounds and improves detection efficiency with pixel-level predictions. The Gaussian-mask ensures ship centers have the highest probabilities, gradually decreasing outward under a Gaussian distribution. Additionally, a weighted rotated boxes fusion (WRBF) strategy combines multi-direction anchor predictions, filtering anchors beyond boundaries or with high overlap but low confidence. Extensive experiments on SSDD+ and HRSID datasets demonstrate the effectiveness and superiority of MLDet.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how to detect ships in synthetic aperture radar (SAR) images more accurately and efficiently. Because of the SAR imaging mechanism, the images exhibit strong speckle noise, complex backgrounds, and large variation in target scale, which makes it difficult for traditional ship-detection methods to perform well. The paper therefore proposes a multitask learning framework, MLDet, to address these challenges.

### Main problems:

1. **Strong speckle noise**: Speckle noise in SAR images interferes with the learning of low-level features and degrades the extraction of high-level semantic features.
2. **Complex background**: Background clutter (sea clutter, islands, land, and so on) leads to false detections and missed detections.
3. **Diversity of target scales**: Ship scale varies greatly in SAR images, and small targets are especially hard to detect.
4. **Angular periodicity and aspect ratio**: Traditional methods have limitations when handling targets at particular angles and aspect ratios.

### Solutions:

MLDet addresses these problems with the following components:

1. **Object detection module**: An angle classification loss with aspect-ratio weighting (ARW) makes the detector more sensitive to angles and aspect ratios, improving detection accuracy.

   \[ L_{\theta}(\theta, \hat{\theta}) = \left|\sin\left(\alpha(\theta - \hat{\theta})\right)\right| \times L_{\text{smooth}}^{1}(\theta, \hat{\theta}) \]

   where

   \[ \alpha = \begin{cases} 1, & \text{if } (h / w) > r \\ 2, & \text{otherwise} \end{cases} \]

   Here \(h\) and \(w\) are the long side and short side of the ground-truth box, respectively, and \(r\) is the aspect-ratio threshold, set to 1.5.
2. **Denoising feature fusion module**: A dual-feature fusion attention mechanism (DFF) fuses shallow features with denoising features, reducing interference from complex backgrounds and enhancing the saliency of target features.
3. **Object segmentation module**: A rotated Gaussian-mask is used for target segmentation so that the ship center has the highest confidence and the surrounding region decays according to a Gaussian distribution, which helps extract the target area.

   \[ g(i, j) = \exp\left(-\left(\frac{\lambda_{w}(i_{h} - x)^{2}}{2 w^{2}} + \frac{\lambda_{h}(j_{h} - y)^{2}}{2 h^{2}}\right)\right) \]

   where \((x, y)\) are the ship center coordinates, \(h\) and \(w\) are the height and width of the ship, \(\theta\) is the rotation angle, and \(\lambda_{w}\) and \(\lambda_{h}\) are covariance control factors.
4. **Weighted rotated boxes fusion (WRBF) strategy**: Predictions from multi-direction anchors are combined, and anchors that fall outside the image boundary or that have high overlap but low confidence are discarded, improving the generalization ability of the detector.

With these components, MLDet achieves higher accuracy and robustness in SAR ship detection.
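The angle loss with aspect-ratio weighting from the object detection module can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes angles are in radians and uses the common smooth L1 definition with a transition point `beta = 1`, which the summary does not specify. The function names `smooth_l1` and `angle_loss` are hypothetical.

```python
import math


def smooth_l1(diff: float, beta: float = 1.0) -> float:
    """Common smooth L1 form (assumed here): quadratic near zero, linear beyond beta."""
    a = abs(diff)
    return 0.5 * a * a / beta if a < beta else a - 0.5 * beta


def angle_loss(theta: float, theta_hat: float, h: float, w: float, r: float = 1.5) -> float:
    """|sin(alpha * (theta - theta_hat))| * smooth_l1(theta - theta_hat).

    h and w are the long and short sides of the ground-truth box;
    r is the aspect-ratio threshold (1.5 in the summary).
    Angles are assumed to be in radians.
    """
    # alpha = 1 for elongated boxes (h / w > r), 2 otherwise
    alpha = 1.0 if (h / w) > r else 2.0
    diff = theta - theta_hat
    return abs(math.sin(alpha * diff)) * smooth_l1(diff)


if __name__ == "__main__":
    # Elongated box (alpha = 1) versus near-square box (alpha = 2), same angle error
    print(angle_loss(math.pi / 2, 0.0, h=10.0, w=2.0))
    print(angle_loss(math.pi / 2, 0.0, h=10.0, w=10.0))
```

In this sketch the sine factor vanishes at an angle difference of \(\pi\) for an elongated box (\(\alpha = 1\)) and at \(\pi/2\) for a near-square box (\(\alpha = 2\)), which is one way to read how the weighting handles angular periodicity.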
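The rotated Gaussian-mask from the object segmentation module can be illustrated with a small pure-Python sketch. Several points are assumptions, since the summary does not define them: \((i_h, j_h)\) is taken to be the pixel position expressed in the box-aligned frame, obtained by rotating the offset from the center by \(\theta\); `i` is treated as the column index and `j` as the row index; and the default values of `lam_w` and `lam_h` are placeholders. The function name `rotated_gaussian_mask` is hypothetical.

```python
import math


def rotated_gaussian_mask(height, width, x, y, w, h, theta, lam_w=1.0, lam_h=1.0):
    """Return a height x width list of lists with values in (0, 1].

    (x, y): ship center (column, row); w, h: ship width and height;
    theta: rotation angle in radians (assumed convention);
    lam_w, lam_h: covariance control factors (placeholder defaults).
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    mask = []
    for j in range(height):          # row index
        row = []
        for i in range(width):       # column index
            dx, dy = i - x, j - y
            # Assumed meaning of (i_h - x, j_h - y): offset rotated into the box frame
            u = cos_t * dx + sin_t * dy
            v = -sin_t * dx + cos_t * dy
            exponent = lam_w * u * u / (2 * w * w) + lam_h * v * v / (2 * h * h)
            row.append(math.exp(-exponent))
        mask.append(row)
    return mask


if __name__ == "__main__":
    m = rotated_gaussian_mask(9, 9, x=4, y=4, w=2, h=4, theta=0.0)
    for row in m:
        print(" ".join(f"{v:.2f}" for v in row))
```

The mask peaks at exactly 1 at the ship center and decays faster along the axis with the smaller extent, so a narrow, long ship produces an elongated high-confidence region aligned with its orientation.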