Enhanced YOLOv7 with three-dimensional attention and its application into underwater object detection

Yi Qin, Chen Liang, Yongfang Mao, Mingliang Zhou
DOI: https://doi.org/10.1007/s11042-024-19966-3
IF: 2.577
2024-01-01
Multimedia Tools and Applications
Abstract: In recent years, as marine resources have grown in importance, robots have been widely used for underwater exploration, seafood delivery, and fishing. The visual perception system serves as the robot's "eyes" and is indispensable for control and autonomous navigation. Object detection is a key part of that system and determines how accurately the robot perceives its surroundings. The YOLOv7 algorithm offers few parameters, fast detection, and high accuracy. For overlapping underwater targets, however, YOLOv7 suffers from missed and false detections and has difficulty separating targets from the background. This paper therefore proposes a new YOLO algorithm. To address missed detections of multiple overlapping targets, a three-dimensional attention convolution module is proposed, which successfully identifies multiple overlapping echinus and holothurian targets. To address the difficulty of separating target and background, the upsampling process is redesigned to retain deep semantic information more fully. Experiments were carried out on a dataset containing four kinds of underwater targets, and mAP was improved by 0.7
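The abstract does not say how the three-dimensional attention is computed. To illustrate the general idea of giving every channel and spatial position its own weight, the sketch below uses SimAM-style parameter-free attention, a separately published technique. It is an assumption made for illustration and should not be read as the authors' module. The function name `simam_attention` and the parameter `e_lambda` are hypothetical.

```python
import math


def simam_attention(feature_map, e_lambda=1e-4):
    """Apply SimAM-style attention to a C x H x W feature map.

    feature_map: list of channels, each a list of rows of floats.
    Returns a new nested list with the same shape, where each value is
    scaled by its own attention weight in (0, 1).
    """
    out = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        # Squared deviation of every position from the channel mean.
        sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
        # Channel variance estimate (divides by n - 1, guarded for n == 1).
        variance = sum(d for row in sq_dev for d in row) / max(len(values) - 1, 1)
        new_channel = []
        for row, dev_row in zip(channel, sq_dev):
            new_row = []
            for v, d in zip(row, dev_row):
                # Positions that stand out from the channel mean score higher.
                score = d / (4.0 * (variance + e_lambda)) + 0.5
                weight = 1.0 / (1.0 + math.exp(-score))  # sigmoid
                new_row.append(v * weight)
            new_channel.append(new_row)
        out.append(new_channel)
    return out


# Toy input: channel 0 has one position that differs from the rest.
toy = [[[1.0, 1.0], [1.0, 5.0]], [[0.0, 0.0], [0.0, 0.0]]]
weighted = simam_attention(toy)
```

In this toy input, the position holding 5.0 receives a larger weight than the positions holding 1.0, which is the behaviour that makes such attention useful for emphasising a target against a uniform background. A real detector would apply the same computation to tensors inside a convolutional block rather than to nested lists.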