Autonomous Underwater Robot for Underwater Image Enhancement Via Multi-Scale Deformable Convolution Network with Attention Mechanism

Yi Lin, Jingchun Zhou, Wenqi Ren, Weishi Zhang
DOI: https://doi.org/10.1016/j.compag.2021.106497
IF: 8.3
2021-01-01
Computers and Electronics in Agriculture
Highlights:
• We develop a novel encoder-decoder architecture with an attention-based skip fusion, integrated into an autonomous underwater robot for underwater image enhancement.
• We propose a multi-scale deformable convolutional network, which acquires deformable local receptive fields at different scales and enriches the diversity of feature representation.
• We adopt a parallel attention block over the spatial and channel dimensions to focus on the regions of interest in the input feature map.
• Our MsDCANet achieves state-of-the-art performance on a recent benchmark in terms of both visual quality and objective metrics, which is conducive to integration with underwater robots and to applications in aquaculture.

Abstract: Underwater images suffer from poor visibility, which results from selective attenuation and scattering. These interdependent phenomena jointly degrade the image and prevent autonomous underwater robots from recognizing its contents. To address these problems, we propose a multi-scale deformable convolution network with an attention mechanism (MsDCANet) to enhance the quality of underwater images. The overall model follows an encoder-decoder architecture. Concretely, we first propose a multi-scale deformable convolutional network, which acquires deformable local receptive fields at different scales and enriches the diversity of feature representation. Coupled with spatial and channel attention mechanisms, the network acquires task-oriented high-level representations from the input feature maps and highlights the most significant features. Because the encoder and the decoder both contribute to the final enhancement task, we design an attention-based skip fusion scheme that improves on the original skip connection, with the aim of reconstructing higher-quality underwater images. Finally, the proposed model is optimized with a multi-task loss function that combines a pixel loss and a perceptual loss.
Experiments on underwater images captured in diverse scenes demonstrate that the proposed model produces visually pleasing results and generalizes well, significantly outperforming several state-of-the-art methods. In addition, our approach can improve the performance of vision tasks and can be applied to aquaculture development.
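
As a rough illustration of the multi-task loss mentioned in the abstract, the sketch below combines a pixel term and a perceptual term into one scalar. The abstract does not give the exact formulation, so several choices here are assumptions: an L1 pixel loss, a mean-squared perceptual loss, the weight `perceptual_weight`, and the helper `toy_features`. Perceptual losses are usually computed on the feature maps of a pretrained network; `toy_features` is only a hypothetical stand-in so that the example runs with the standard library alone.

```python
from typing import Callable, List, Sequence

Pixels = Sequence[float]


def mean_abs_error(a: Pixels, b: Pixels) -> float:
    """Mean absolute difference (L1), used here as the assumed pixel loss."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def mean_sq_error(a: Pixels, b: Pixels) -> float:
    """Mean squared difference, used here to compare feature vectors."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def toy_features(image: Pixels) -> List[float]:
    """Hypothetical feature extractor: differences between neighbouring pixels.

    A real perceptual loss would use activations of a pretrained network.
    """
    return [image[i + 1] - image[i] for i in range(len(image) - 1)]


def multi_task_loss(
    enhanced: Pixels,
    reference: Pixels,
    feature_fn: Callable[[Pixels], Sequence[float]] = toy_features,
    perceptual_weight: float = 0.1,  # assumed weighting, not from the paper
) -> float:
    """Pixel loss plus a weighted perceptual loss (assumed form)."""
    pixel = mean_abs_error(enhanced, reference)
    perceptual = mean_sq_error(feature_fn(enhanced), feature_fn(reference))
    return pixel + perceptual_weight * perceptual


if __name__ == "__main__":
    enhanced = [0.2, 0.4, 0.6]   # flattened toy "image"
    reference = [0.2, 0.5, 0.9]  # flattened toy ground truth
    print(multi_task_loss(enhanced, reference))
```

The pixel term penalizes per-pixel intensity errors, while the perceptual term penalizes differences in the extracted features; the weight controls the balance between the two and would normally be tuned on validation data.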