YOLOv8-MU: An Improved YOLOv8 Underwater Detector Based on a Large Kernel Block and a Multi-Branch Reparameterization Module

Xing Jiang, Xiting Zhuang, Jisheng Chen, Jian Zhang, Yiwen Zhang
DOI: https://doi.org/10.3390/s24092905
IF: 3.9
2024-05-02
Sensors
Abstract: Underwater visual detection technology is crucial for marine exploration and monitoring. Given the growing demand for accurate underwater target recognition, this study introduces an innovative architecture, YOLOv8-MU, which significantly enhances the detection accuracy. This model incorporates the large kernel block (LarK block) from UniRepLKNet to optimize the backbone network, achieving a broader receptive field without increasing the model's depth. Additionally, the integration of C2fSTR, which combines the Swin transformer with the C2f module, and the SPPFCSPC_EMA module, which blends Cross-Stage Partial Fast Spatial Pyramid Pooling (SPPFCSPC) with attention mechanisms, notably improves the detection accuracy and robustness for various biological targets. A fusion block from DAMO-YOLO further enhances the multi-scale feature extraction capabilities in the model's neck. Moreover, the adoption of the MPDIoU loss function, designed around the vertex distance, effectively addresses the challenges of localization accuracy and boundary clarity in underwater organism detection. The experimental results on the URPC2019 dataset indicate that YOLOv8-MU achieves an mAP@0.5 of 78.4%, showing an improvement of 4.0% over the original YOLOv8 model. Additionally, on the URPC2020 dataset, it achieves 80.9%, and, on the Aquarium dataset, it reaches 75.5%, surpassing other models, including YOLOv5 and YOLOv8n, thus confirming the wide applicability and generalization capabilities of our proposed improved model architecture. Furthermore, an evaluation on the improved URPC2019 dataset demonstrates leading performance (SOTA), with an mAP@0.5 of 88.1%, further verifying its superiority on this dataset. These results highlight the model's broad applicability and generalization capabilities across various underwater datasets.
Engineering, Electrical & Electronic; Instruments & Instrumentation; Chemistry, Analytical
What problem does this paper attempt to address?
The paper addresses the challenges of underwater target detection and proposes an improved YOLOv8 model, YOLOv8-MU, that aims to raise detection accuracy and robustness in underwater environments. Specifically, it targets the following issues:

1. **Unstable lighting conditions**: Light absorption and scattering underwater reduce image contrast, making targets harder to distinguish from the background.
2. **Poor image quality**: Water flow, suspended particles, and bubbles blur and distort images, further reducing recognition accuracy.
3. **Target diversity**: Underwater organisms vary widely in shape and size, which increases the complexity of the detection task.
4. **Noise interference**: External factors such as waves, bubbles, and floating objects interfere with detection and recognition.

To overcome these challenges, the authors propose the following key technical improvements:

- **Adoption of the LarK block**: Taken from UniRepLKNet, it optimizes the backbone network to achieve a wider receptive field without increasing the model depth.
- **Integration of the C2fSTR module**: Combining the Swin Transformer with the C2f module enhances the fusion of information at different scales, improving detection accuracy and robustness for various biological targets.
- **Incorporation of the SPPFCSPC_EMA module**: Combining Cross-Stage Partial Fast Spatial Pyramid Pooling (SPPFCSPC) with attention mechanisms further enhances feature representation and multi-scale information processing.
- **Fusion block from DAMO-YOLO**: Added to the model's neck, it further enhances multi-scale feature extraction.
- **Minimum Point Distance Intersection over Union (MPDIoU) loss function**: Designed around vertex distance, it addresses issues of localization accuracy and boundary clarity.
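To make the vertex-distance idea behind the MPDIoU loss concrete, the following is a minimal pure-Python sketch. It assumes the formulation commonly given for MPDIoU: the plain IoU minus the squared distances between the two boxes' top-left corners and between their bottom-right corners, each normalized by the squared diagonal of the input image. The function names `mpdiou` and `mpdiou_loss` and the `(x1, y1, x2, y2)` box layout are illustrative assumptions, not the authors' code, and the paper's actual implementation may differ in detail.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU between two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed formulation: IoU - d1^2 / (w^2 + h^2) - d2^2 / (w^2 + h^2),
    where d1 and d2 are the distances between the top-left and the
    bottom-right corners of the two boxes, and w, h are the image size.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared corner-to-corner distances (the "vertex distance" terms).
    d1_sq = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2  # top-left corners
    d2_sq = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal so the penalty is scale-free.
    norm = img_w ** 2 + img_h ** 2
    return iou - d1_sq / norm - d2_sq / norm


def mpdiou_loss(pred_box, gt_box, img_w, img_h):
    """Loss is 1 - MPDIoU: zero for a perfect match, larger as corners drift."""
    return 1.0 - mpdiou(pred_box, gt_box, img_w, img_h)


if __name__ == "__main__":
    # Hypothetical boxes on a hypothetical 640x640 input image.
    print(mpdiou_loss((100, 100, 200, 200), (110, 105, 210, 195), 640, 640))
```

Unlike plain IoU, the corner-distance terms still give a usable signal when the predicted and ground-truth boxes do not overlap at all, which is one plausible reason such a loss helps with localization of small or blurry targets.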
Experimental results show that YOLOv8-MU achieves 78.4% mAP@0.5 on the URPC2019 dataset, an improvement of 4.0% over the original YOLOv8. It reaches 80.9% on the URPC2020 dataset and 75.5% on the Aquarium dataset, surpassing other models including YOLOv5 and YOLOv8n. On the improved URPC2019 dataset, the model achieves 88.1% mAP@0.5, which the authors report as state-of-the-art performance on this dataset. These results indicate that the YOLOv8-MU model generalizes well and is applicable across a range of underwater datasets.
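For readers unfamiliar with the metric, the sketch below illustrates what the "@0.5" in mAP@0.5 refers to: a prediction counts as a true positive only if its IoU with a not-yet-matched ground-truth box is at least 0.5. It computes average precision for a single class on a single image using all-point interpolation; mAP is then the mean of this value over classes. The function names and the toy boxes are hypothetical, and real evaluators (for example the ones shipped with YOLO toolkits) aggregate over the whole dataset and may use a different interpolation scheme.

```python
def iou(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(preds, gts, iou_thr=0.5):
    """AP for one class. preds: list of (score, box); gts: list of boxes."""
    if not gts:
        return 0.0

    # Visit predictions from most to least confident.
    preds = sorted(preds, key=lambda p: p[0], reverse=True)
    matched = set()
    tp_flags = []
    for _, box in preds:
        # Find the best still-unmatched ground-truth box.
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            v = iou(box, gt)
            if v > best_iou:
                best_iou, best_j = v, j
        # True positive only if the overlap reaches the threshold.
        if best_iou >= iou_thr:
            matched.add(best_j)
            tp_flags.append(True)
        else:
            tp_flags.append(False)

    # Running precision and recall after each prediction.
    tp = 0
    precisions, recalls = [], []
    for k, flag in enumerate(tp_flags, start=1):
        tp += int(flag)
        precisions.append(tp / k)
        recalls.append(tp / len(gts))

    # Make precision non-increasing from right to left (the PR envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the precision-recall envelope.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


if __name__ == "__main__":
    # Toy data: two ground-truth boxes, three predictions (one false positive).
    gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (21, 21, 30, 30))]
    print(average_precision(preds, gts))
```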