A novel target detection method with dual‐domain multi‐frequency feature in side‐scan sonar images

Wen Wang, Yifan Zhang, Houpu Li, Yixin Kang, Lei Liu, Cheng Chen, Guojun Zhai
DOI: https://doi.org/10.1049/ipr2.13241
IF: 2.3
2024-09-20
IET Image Processing
Abstract: Side‐scan sonar (SSS) detection is a key method in underwater environmental security and subsea resource development. However, many detection approaches for acoustic images primarily follow the evolution path of optical image object detection, resulting in complex structures and limited versatility. To tackle this issue, we introduce a dual‐domain multi‐frequency network (D2MFNet) designed to harness the distinct characteristics of SSS image detection. In D2MFNet, aiming at the underwater detection requirements of small scenes, we propose a multi‐frequency combined attention mechanism (MFCAM), a novel method for optimizing and improving detection sensitivity in different frequency ranges. This mechanism amplifies the relevance of dual‐domain features across different channels and spaces. Moreover, recognizing that SSS images can provide richer insights after frequency‐domain conversion, we introduce a dual‐domain feature pyramid network (D2FPN) that significantly augments the depth and breadth of feature information in small underwater datasets. The methods offer plug‐and‐play functionality with substantial performance enhancements. Extensive experiments are conducted to validate the efficacy of the proposed techniques, and the results showcase their state‐of‐the‐art performance. MFCAM improves the mean average precision (mAP) by 16.9% on the KLSG dataset and 15.5% on the SCTD dataset, and D2FPN improves the mAP by 8.4% on the KLSG dataset and 9.8% on the SCTD dataset. The code and models will be publicly available at https://dagshub.com/estrellaww00/D2MFNet.
computer science, artificial intelligence; engineering, electrical & electronic; imaging science & photographic technology