Abstract: In recent years, deep learning techniques have substantially improved the accuracy of synthetic aperture radar (SAR) ship detection. Nonetheless, SAR images present distinct challenges, including complex backgrounds, small ship targets, and noise interference, which make accurate detection particularly demanding. In this paper, we introduce LRTransDet, a real-time SAR ship detector that uses a lightweight vision transformer (ViT) and a multi-scale feature fusion neck to address these challenges. First, our model implements a lightweight backbone that combines convolutional neural networks (CNNs) and transformers, enabling it to capture both local and global features from input SAR images. Moreover, we boost the model's efficiency by incorporating the faster weighted feature fusion (Faster-WF2) module and the coordinate attention (CA) mechanism within the feature fusion neck. These components reduce computational cost while maintaining detection performance. To overcome the challenge of detecting small ship targets in SAR images, we refine the original loss function by combining the normalized Wasserstein distance (NWD) metric with the intersection over union (IoU) scheme. This combination improves the detector's ability to detect small targets. To evaluate the proposed model, we conducted experiments on four challenging datasets (the SSDD, the SAR-Ship Dataset, the HRSID, and the LS-SSDD-v1.0). The results demonstrate that our model surpasses both general object detectors and state-of-the-art SAR ship detectors in terms of detection accuracy (97.8% on the SSDD and 93.9% on the HRSID) and speed (74.6 FPS on the SSDD and 75.8 FPS on the HRSID), while requiring only 3.07 M parameters. Additionally, we conducted a series of ablation experiments to illustrate the impact of the EfficientViT backbone, the Faster-WF2 module, the CA mechanism, and the NWD metric on multi-scale feature fusion and detection performance.
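
To illustrate how an NWD term can be combined with an IoU term in a box-regression loss, the following is a minimal sketch and not the LRTransDet implementation. It uses the commonly cited NWD formulation, in which each box (cx, cy, w, h) is modeled as a 2D Gaussian and the Wasserstein distance is mapped to a similarity through exp(-sqrt(W2^2)/C). The function names, the mixing weight `alpha`, and the value of the constant `c` are assumptions chosen for illustration; the abstract does not state them.

```python
import math


def iou(a, b):
    """Plain IoU of two axis-aligned boxes given as (cx, cy, w, h)."""
    ax1, ay1 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax2, ay2 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx2, by2 = b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(a, b, c=12.8):
    """Normalized Wasserstein distance between two boxes (cx, cy, w, h).

    Each box is treated as a Gaussian with mean (cx, cy) and covariance
    diag(w^2/4, h^2/4). `c` is a dataset-dependent scale constant; the
    default here is a placeholder assumption.
    """
    w2_squared = (
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


def box_loss(pred, target, alpha=0.5, c=12.8):
    """Weighted sum of (1 - NWD) and (1 - IoU); `alpha` is hypothetical."""
    return alpha * (1.0 - nwd(pred, target, c)) + (1.0 - alpha) * (1.0 - iou(pred, target))


if __name__ == "__main__":
    small_pred = (10.0, 10.0, 4.0, 4.0)
    small_gt = (16.0, 10.0, 4.0, 4.0)
    # The boxes do not overlap, so IoU is 0, but NWD still reflects how close they are.
    print(iou(small_pred, small_gt), nwd(small_pred, small_gt), box_loss(small_pred, small_gt))
```

The usage example shows why such a combination may help with small targets: for two small, non-overlapping boxes the IoU term is zero regardless of how far apart they are, whereas the NWD term still varies smoothly with their distance.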

Ship object detection in one-stage framework based on Swin-Transformer.

SwinSeg: Swin transformer and MLP hybrid network for ship segmentation in maritime surveillance system

LPST-Det: Local-Perception-Enhanced Swin Transformer for SAR Ship Detection

Ship Detection in SAR Images Based on Feature Enhancement Swin Transformer and Adjacent Feature Fusion

Lightweight ship detection method based on Swin-YOLOFormer

High Performance Ship Detection via Transformer and Feature Distillation

EfficientShip: A Hybrid Deep Learning Framework for Ship Detection in the River

SAR-ShipSwin: enhancing SAR ship detection with robustness in complex environment

YOLOv5s maritime distress target detection method based on swin transformer

ST-YOLOA: a Swin-transformer-based YOLO model with an attention mechanism for SAR ship detection under complex background

NSD-SSD: A Novel Real-Time Ship Detector Based on Convolutional Neural Network in Surveillance Video

SAR Ship Detection Based on Swin Transformer and Feature Enhancement Feature Pyramid Network

A Lightweight Algorithm for Ship Object Detection in Complex Marine Environments

Ship Detection Based on YOLO Algorithm for Visible Images

Object Detection Based on Swin Deformable Transformer-BiPAFPN-YOLOX

YOLO-SD: Small Ship Detection in SAR Images by Multi-Scale Convolution and Feature Transformer Module

LRTransDet: A Real-Time SAR Ship-Detection Network with Lightweight ViT and Multi-Scale Feature Fusion

Lightweight Single-Stage Ship Object Detection Algorithm for Unmanned Surface Vessels Based on Improved YOLOv5

YOLOSeaShip: a lightweight model for real-time ship detection

Ship detection and identification in SDGSAT-1 glimmer images based on the glimmer YOLO model

Underwater Target Detection Algorithm Based on YOLO and Swin Transformer for Sonar Images