SonarNet: Hybrid CNN-Transformer-HOG Framework and Multifeature Fusion Mechanism for Forward-Looking Sonar Image Segmentation

Ju He, Jianfeng Chen, Hu Xu, Yang Yu
DOI: https://doi.org/10.1109/tgrs.2024.3368659
IF: 8.2
2024-03-02
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Forward-looking sonar (FLS) image segmentation plays a significant role in ocean engineering. However, existing image segmentation algorithms struggle to extract features from FLS images, which have weak semantic information, complex backgrounds, and strong environmental noise. Convolutional neural networks (CNNs) have demonstrated remarkable capabilities in semantic segmentation tasks, but the locality of convolution limits their ability to extract global context and long-range semantic information. Effective extraction of global contextual information is indispensable for accurate segmentation in sonar image processing. In this article, we propose a novel semantic segmentation architecture for FLS images called SonarNet. SonarNet is based on a hybrid CNN-transformer-HOG framework, where HOG denotes the histogram of oriented gradients, and comprises four modules: 1) the global–local encoder extracts both global and detailed feature information of the underwater target; 2) the network decoder converts the high-semantic feature map into a pixel-level classification; 3) the global–local fusion module acts as a bridge between the dual encoders and ensures semantic consistency between them; and 4) the HOG feature encoder and fusion module extracts traditional handcrafted features and performs feature alignment. We conducted comprehensive ablation experiments to validate the efficacy of the designed modules. The experiments show that SonarNet significantly outperforms other CNN-based and CNN-transformer FLS image segmentation methods.
imaging science & photographic technology; remote sensing; engineering, electrical & electronic; geochemistry & geophysics
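
The abstract mentions a HOG feature encoder but does not describe its internals. As a rough illustration of the kind of handcrafted feature involved, the sketch below computes a standard histogram-of-oriented-gradients descriptor for a single image cell in plain Python. It is not SonarNet's implementation: the function name `hog_cell_histogram`, the choice of 9 unsigned orientation bins, the clamped central differences, and the L2 normalization are all assumptions made for this example.

```python
import math


def hog_cell_histogram(cell, n_bins=9):
    """Orientation histogram for one cell (a list of rows of floats).

    Hypothetical helper for illustration only; bin count and
    normalization are assumptions, not taken from the paper.
    """
    h, w = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins  # unsigned gradients: angles in [0, 180)
    for y in range(h):
        for x in range(w):
            # Central differences, with indices clamped at the borders.
            gx = cell[y][min(x + 1, w - 1)] - cell[y][max(x - 1, 0)]
            gy = cell[min(y + 1, h - 1)][x] - cell[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue  # flat pixel, contributes nothing
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The final modulo guards against an angle that rounds to 180.0.
            b = int(angle // bin_width) % n_bins
            hist[b] += mag  # vote weighted by gradient magnitude
    # L2-normalize so the descriptor is less sensitive to contrast.
    norm = math.sqrt(sum(v * v for v in hist)) + 1e-6
    return [v / norm for v in hist]


# A vertical edge: intensity changes only along x.
vertical_edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
# A horizontal edge: intensity changes only along y.
horizontal_edge = [[0.0] * 4, [0.0] * 4, [1.0] * 4, [1.0] * 4]

v_hist = hog_cell_histogram(vertical_edge)
h_hist = hog_cell_histogram(horizontal_edge)
```

With 9 bins of 20 degrees each, the vertical edge places all of its gradient energy in the bin around 0 degrees and the horizontal edge in the bin containing 90 degrees. A full HOG descriptor would repeat this over a grid of cells and normalize over overlapping blocks; how SonarNet encodes and aligns such features with its learned features is described in the paper itself, not here.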