LCAS-DetNet: A Ship Target Detection Network for Synthetic Aperture Radar Images

Junlin Liu, Dingyi Liao, Xianyao Wang, Jun Li, Bing Yang, Guanyu Chen
DOI: https://doi.org/10.3390/app14125322
2024-01-01
Applied Sciences
Abstract: Monitoring ships on water surfaces is hampered by weather conditions, sunlight, and water ripples, which makes accurate real-time detection of target ships difficult. Synthetic Aperture Radar (SAR) offers a viable solution for real-time ship detection because it is unaffected by cloud coverage, precipitation, or light levels. However, SAR images are often degraded by speckle noise, salt-and-pepper noise, and water surface ripple interference. This study introduces LCAS-DetNet, a Multi-Location Cross-Attention Ship Detection Network tailored to ships in SAR images. Modeled on the YOLO architecture, LCAS-DetNet comprises a feature extractor, an intermediate layer (“Neck”), and a detection head. The feature extractor computes Multi-Location Cross-Attention (MLCA) to extract ship features precisely at multiple scales. MLCA combines a local branch and a global branch and uses a cross-attention mechanism to strengthen the network’s ability to discern spatial arrangements and identify targets. Each branch applies Multi-Location Attention (MLA), which calculates pixel-level correlations in both the channel and spatial dimensions and further reduces the impact of salt-and-pepper noise on the distribution of target ship pixels. The feature extractor stacks downsampling and MLCA stages, enhanced with residual connections and Patch Embedding, to improve the network’s multi-scale spatial recognition. As the network deepens, this structure becomes cascaded and multi-scale, giving the network a richer receptive field. Additionally, we introduce a loss function based on Wise-IoUv3 to address the influence of label quality on the gradient updates. The network was validated on the HRSID and SSDD datasets, where it achieved state-of-the-art performance: a precision of 96.59% on HRSID and 97.52% on SSDD.
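To illustrate the loss the abstract refers to, the following is a minimal sketch of a Wise-IoU v3 style bounding-box loss for a single box pair, written from the commonly cited Wise-IoU formulation rather than from this paper. The function names (`iou`, `wiou_v3`), the `(x1, y1, x2, y2)` box layout, the defaults `alpha=1.9` and `delta=3.0`, and the `mean_iou_loss` argument (standing in for a running mean of the IoU loss) are all assumptions for illustration. The abstract states only that the loss is "based on" Wise-IoUv3, so the authors' exact variant may differ.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def wiou_v3(pred, target, mean_iou_loss, alpha=1.9, delta=3.0):
    """Wise-IoU v3 style loss for one predicted/target box pair (sketch).

    mean_iou_loss is assumed to be a positive running mean of the IoU loss
    over recent training samples. In a real training loop the enclosing-box
    term and beta would be detached from the gradient; this plain-Python
    version has no autograd, so that step is only noted here.
    """
    l_iou = 1.0 - iou(pred, target)

    # Box centers.
    px, py = (pred[0] + pred[2]) / 2.0, (pred[1] + pred[3]) / 2.0
    tx, ty = (target[0] + target[2]) / 2.0, (target[1] + target[3]) / 2.0

    # Width and height of the smallest box enclosing both boxes
    # (assumed non-degenerate, so the denominator below is positive).
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])

    # Distance-based attention term (the v1 part of the loss).
    r_wiou = math.exp(((px - tx) ** 2 + (py - ty) ** 2) / (wg ** 2 + hg ** 2))

    # Outlier degree: large when this sample's loss is far above the mean,
    # which is typical of low-quality labels.
    beta = l_iou / mean_iou_loss

    # Non-monotonic focusing coefficient: down-weights both very easy
    # samples (small beta) and likely outliers (large beta).
    r = beta / (delta * alpha ** (beta - delta))

    return r * r_wiou * l_iou
```

When `beta` equals `delta` the focusing coefficient is 1 and the loss reduces to the distance-weighted IoU loss; samples with a much larger outlier degree receive a smaller gradient gain, which is how this family of losses limits the effect of poorly labeled boxes.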