Shuffled Grouping Cross-Channel Attention-based Bilateral-Filter-Interpolation Deformable ConvNet with Applications to Benthonic Organism Detection

Tingkai Chen, Ning Wang
DOI: https://doi.org/10.1109/tai.2024.3385387
2024-01-01
Abstract: Underwater detection degrades under unknown geometric variation caused by scale, pose, viewpoint, and occlusion, compounded by low contrast and color distortion. To tackle these problems holistically, this paper establishes a shuffled grouping cross-channel attention-based bilateral-filter-interpolation deformable ConvNet (SGCA-BDC) framework for benthonic organism detection. The main contributions are as follows: 1) By jointly considering the spatial and feature similarities between offset and integer coordinate positions, a bilateral-filter-interpolation deformable ConvNet (BDC) with a modulation weight mechanism is created, so that the sampling ability of the convolutional kernel for benthonic organisms with unknown geometric variation is adaptively augmented from the spatial perspective. 2) By using 1-D convolution to recalibrate the channel weights of grouped sub-features via an information-entropy statistic, a shuffled grouping cross-channel attention (SGCA) module is devised, so that seabed background noise is suppressed from the channel perspective. 3) The SGCA-BDC scheme is built by integrating the BDC and SGCA modules. Comprehensive experiments and comparisons show that the SGCA-BDC scheme outperforms typical detection approaches, including Faster RCNN, SSD, YOLOv6, YOLOv7, YOLOv8, RetinaNet, and CenterNet, in mean average precision by 8.54%, 4.4%, 5.18%, 3.1%, 3.01%, 12.53%, and 7.09%, respectively.
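To make the first contribution more concrete, the following is a minimal sketch of how a feature map might be sampled at a fractional (offset) position using weights that combine spatial closeness and feature similarity, in the spirit of a bilateral filter. It is not the authors' formulation: the abstract gives no equations, so the Gaussian kernels, the parameters `sigma_s` and `sigma_f`, the use of the plain bilinear estimate as the feature reference, and the function name `bilateral_interpolate` are all assumptions made for illustration. The modulation weight mechanism is omitted.

```python
import math


def bilateral_interpolate(feature, y, x, sigma_s=1.0, sigma_f=1.0):
    """Sample a 2-D feature map (list of rows) at fractional position (y, x).

    Hypothetical sketch: each of the four surrounding integer positions is
    weighted by a spatial Gaussian (distance to the offset position) times a
    feature Gaussian (difference from a reference value). The reference value
    is the ordinary bilinear estimate, which is an assumption of this sketch.
    """
    h, w = len(feature), len(feature[0])
    # Clamp the sampling position to the valid range of the map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)

    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0

    # Plain bilinear estimate, used here only as the feature reference.
    ref = (feature[y0][x0] * (1 - dy) * (1 - dx)
           + feature[y0][x1] * (1 - dy) * dx
           + feature[y1][x0] * dy * (1 - dx)
           + feature[y1][x1] * dy * dx)

    num, den = 0.0, 0.0
    for ny, nx in ((y0, x0), (y0, x1), (y1, x0), (y1, x1)):
        value = feature[ny][nx]
        spatial_d2 = (ny - y) ** 2 + (nx - x) ** 2
        feature_d2 = (value - ref) ** 2
        weight = (math.exp(-spatial_d2 / (2 * sigma_s ** 2))
                  * math.exp(-feature_d2 / (2 * sigma_f ** 2)))
        num += weight * value
        den += weight

    # If every weight underflows to zero, fall back to the bilinear estimate.
    return num / den if den > 0.0 else ref


if __name__ == "__main__":
    fmap = [[1.0, 1.0], [3.0, 3.0]]
    print(bilateral_interpolate(fmap, 0.5, 0.5))
```

In this sketch, neighbors whose feature values differ strongly from the reference contribute less, so a sample taken near an object boundary is pulled less toward values on the other side of it. In a deformable convolution, such a routine would stand in for the bilinear sampling step at each learned offset position.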