Abstract: Underwater visual detection technology is crucial for marine exploration and monitoring. Given the growing demand for accurate underwater target recognition, this study introduces an architecture, YOLOv8-MU, which significantly enhances detection accuracy. The model incorporates the large kernel block (LarK block) from UniRepLKNet to optimize the backbone network, achieving a broader receptive field without increasing the model's depth. Additionally, the integration of C2fSTR, which combines the Swin transformer with the C2f module, and the SPPFCSPC_EMA module, which blends Cross-Stage Partial Fast Spatial Pyramid Pooling (SPPFCSPC) with attention mechanisms, notably improves the detection accuracy and robustness for various biological targets. A fusion block from DAMO-YOLO further enhances the multi-scale feature extraction capabilities in the model's neck. Moreover, the adoption of the MPDIoU loss function, designed around the vertex distance, addresses the challenges of localization accuracy and boundary clarity in underwater organism detection. The experimental results on the URPC2019 dataset indicate that YOLOv8-MU achieves an mAP@0.5 of 78.4%, an improvement of 4.0% over the original YOLOv8 model. On the URPC2020 dataset it achieves 80.9%, and on the Aquarium dataset it reaches 75.5%, surpassing other models, including YOLOv5 and YOLOv8n. Furthermore, an evaluation on the improved URPC2019 dataset demonstrates state-of-the-art (SOTA) performance, with an mAP@0.5 of 88.1%. These results highlight the model's broad applicability and generalization capabilities across various underwater datasets.

What problem does this paper attempt to address?

The paper addresses the challenges faced by underwater target detection and proposes an improved YOLOv8 model (named YOLOv8-MU) aimed at enhancing the accuracy and robustness of target detection in underwater environments. Specifically, it targets the following key issues:

1. **Unstable lighting conditions**: Light absorption and scattering in underwater environments reduce image contrast, making targets harder to distinguish from the background.
2. **Poor image quality**: Factors such as water flow, suspended particles, and bubbles cause image blurring and distortion, further reducing recognition accuracy.
3. **Target diversity**: The wide variety of underwater organisms, with different shapes and sizes, increases the complexity of the detection task.
4. **Noise interference**: External factors such as waves, bubbles, and floating objects interfere with the detection and recognition process.

To overcome these challenges, the authors propose the following key technical improvements:

- **Adoption of the LarK block**: Introduced from UniRepLKNet, it optimizes the backbone network to achieve a wider receptive field without increasing the model depth.
- **Integration of the C2fSTR module**: Combining the Swin Transformer with the C2f module enhances the fusion of information at different scales, improving the detection accuracy and robustness for various biological targets.
- **Incorporation of the SPPFCSPC_EMA module**: Combining Cross-Stage Partial Fast Spatial Pyramid Pooling with attention mechanisms further enhances feature representation and multi-scale information processing capabilities.
- **Minimum Point Distance Intersection over Union (MPDIoU) loss function**: Designed around the vertex distance between predicted and ground-truth boxes, it addresses issues of localization accuracy and boundary clarity.
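To make the vertex-distance idea behind the MPDIoU loss more concrete, the following is a minimal sketch in plain Python. It assumes the commonly cited formulation, in which the IoU is penalized by the squared distances between the two boxes' top-left corners and between their bottom-right corners, each normalized by the squared image diagonal. The summary above does not give the formula, so this formulation, the `(x1, y1, x2, y2)` box layout, and the function names `mpdiou` and `mpdiou_loss` are illustrative assumptions rather than the authors' implementation.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU between two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed formulation: IoU - d1^2 / (w^2 + h^2) - d2^2 / (w^2 + h^2),
    where d1 is the distance between the top-left corners, d2 the distance
    between the bottom-right corners, and w, h are the image dimensions.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between corresponding corner vertices.
    d1_sq = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2  # top-left corners
    d2_sq = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    return iou - d1_sq / norm - d2_sq / norm


def mpdiou_loss(box_a, box_b, img_w, img_h):
    """Loss is 1 - MPDIoU: 0 for identical boxes, above 1 for distant ones."""
    return 1.0 - mpdiou(box_a, box_b, img_w, img_h)


if __name__ == "__main__":
    pred = (12.0, 10.0, 52.0, 48.0)    # hypothetical predicted box
    target = (10.0, 10.0, 50.0, 50.0)  # hypothetical ground-truth box
    print(mpdiou_loss(pred, target, img_w=640, img_h=640))
```

Unlike plain IoU, the corner-distance terms stay informative when the boxes do not overlap at all, so the loss still changes as a prediction moves toward or away from the target.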
Experimental results show that YOLOv8-MU achieves 78.4% mAP@0.5 on the URPC2019 dataset, an improvement of 4.0% over the original YOLOv8. It also reaches 80.9% on the URPC2020 dataset and 75.5% on the Aquarium dataset, surpassing other models including YOLOv5 and YOLOv8n. On the improved URPC2019 dataset, the model achieves 88.1% mAP@0.5. Taken together, these results indicate that YOLOv8-MU generalizes well across different underwater datasets.

YOLOv8-MU: An Improved YOLOv8 Underwater Detector Based on a Large Kernel Block and a Multi-Branch Reparameterization Module

MDM-YOLO: Research on Object Detection Algorithm Based on Improved YOLOv4 for Marine Organisms.

YoloXT: A Object Detection Algorithm for Marine Benthos

Research on Underwater Small Target Detection Algorithm Based on Improved YOLOv3

Underwater Object Detection Algorithm Based on an Improved YOLOv8

An Improved YOLOv5-Based Underwater Object-Detection Framework

YOLO‐UOD: An underwater small object detector via improved efficient layer aggregation network

Underwater Robot Target Detection Algorithm Based on YOLOv8

Underwater target detection based on improved YOLOv7

YOLOv7-SN: Underwater Target Detection Algorithm Based on Improved YOLOv7

Attention-Based Lightweight YOLOv8 Underwater Target Recognition Algorithm

A Lightweight underwater detector enhanced by Attention mechanism, GSConv and WIoU on YOLOv8

Underwater Object Detection Based on Enhanced YOLO

Underwater small target detection under YOLOv8-LA model

RSE-YOLOv8: An Algorithm for Underwater Biological Target Detection

FEB-YOLOv8: A multi-scale lightweight detection model for underwater object detection

Lightweight enhanced YOLOv8n underwater object detection network for low light environments

Improved YOLOv7 model for underwater sonar image object detection

Image-Fused-Guided Underwater Object Detection Model Based on Improved YOLOv7

An underwater target recognition method based on improved YOLOv4 in complex marine environment

YOLOv9-YX: lightweight algorithm for underwater target detection