A real-time object detection method for underwater complex environments based on FasterNet-YOLOv7
Qing Yang, Huijuan Meng, Yuchen Gao, Dexin Gao
DOI: https://doi.org/10.1007/s11554-023-01387-4
IF: 2.293
2023-12-14
Journal of Real-Time Image Processing
Abstract: A FasterNet-You Only Look Once (YOLO)v7 algorithm is proposed for complex underwater environments, where blurred images and cluttered backgrounds make target feature extraction difficult and lead to missed detections. The algorithm aims to improve feature fusion capability and real-time detection of small underwater targets. Before the improved model is trained, the original images acquired by the underwater robot are preprocessed with the Underwater Image Enhancement Convolutional Neural Network (UWCNN) algorithm, which helps to identify targets accurately in the complex marine environment. First, to extract spatial features more efficiently, the algorithm uses Faster Neural Networks (FasterNet-L) as the backbone network to reduce redundant computation and memory access, together with an improved loss function, Focal Efficient Intersection over Union Loss (Focal-EIOU Loss), so that the regression process focuses on high-quality anchor boxes. Second, to address the poor robustness of small-target detection in underwater environments, the algorithm adds the lightweight Cross-modal Transformer Attention (CoTAttention) mechanism to the original algorithm so that detection targets are enhanced in both the channel and spatial dimensions. Finally, the experimental results show that the mean average precision (mAP) of the proposed algorithm reaches 91.8% and the frame rate on actual detection video reaches 83.21. FasterNet-YOLOv7 achieves higher detection accuracy than the Faster Region-Based Convolutional Neural Network (Faster RCNN), Single Shot MultiBox Detection (SSD), YOLOv4, YOLOv5, and YOLOv7 models.
computer science, artificial intelligence; engineering, electrical & electronic; imaging science & photographic technology
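
The abstract names Focal-EIOU Loss as the box-regression loss but does not give its form. The following is a minimal sketch of the commonly cited formulation (an IoU term plus center-distance, width, and height penalties, reweighted by IoU raised to a power gamma), not the authors' implementation. The function name `focal_eiou_loss`, the corner box format `(x1, y1, x2, y2)`, and the default `gamma=0.5` are assumptions made for illustration.

```python
def focal_eiou_loss(pred, target, gamma=0.5, eps=1e-9):
    """Focal-EIOU loss for one pair of boxes given as (x1, y1, x2, y2).

    Illustrative sketch only: gamma and the box format are assumptions,
    not values taken from the paper.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and target boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the enclosing box.
    dx = ((px1 + px2) - (tx1 + tx2)) / 2.0
    dy = ((py1 + py2) - (ty1 + ty2)) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalized by the enclosing box sides.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    eiou = 1.0 - iou + center_term + width_term + height_term

    # Focal reweighting: boxes with higher IoU (higher-quality anchors)
    # contribute more to the loss.
    return (iou ** gamma) * eiou


if __name__ == "__main__":
    print(focal_eiou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

In this form the weight `iou ** gamma` is zero for boxes that do not overlap at all, so such pairs contribute nothing to the loss; this is how the regression comes to concentrate on anchors that already overlap their targets.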