MBIAN: Multi-level Bilateral Interactive Attention Network for Multi-Modal Image Processing

Kai Sun, Jiangshe Zhang, Jialin Wang, Shuang Xu, Chunxia Zhang, Junying Hu
DOI: https://doi.org/10.1016/j.eswa.2023.120733
IF: 8.5
2023-01-01
Expert Systems with Applications
Abstract: Convolutional neural networks (CNNs) have achieved impressive success in the multi-modal image processing (MIP) area. However, many existing CNN approaches fuse the features of the target and guidance images only once, which may cause a loss of information. To alleviate this problem, we present a multi-level bilateral interactive attention network (MBIAN) to fuse the features of the target and guidance images by their progressive interaction at different levels. Concretely, for each level, a bilateral interactive attention block (BIAB) is proposed to fuse the information of target and guidance images and refine their features. As the core component of our BIAB, a novel bilateral interactive attention layer (BIAL) is designed, where target and guidance images can mutually determine the attention weights. In addition, in each BIAB, long and short local shortcuts are employed to further facilitate the flow of information. Numerical experiments are conducted for three different problems, including panchromatic guided multi-spectral image super-resolution, near-infrared guided RGB image denoising, and flash-guided no-flash image denoising. The results demonstrate the versatility and superiority of MBIAN in terms of quantitative metrics and visual inspection, against 14 popular and state-of-the-art methods.
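The abstract gives no formulas, so the following is only a toy sketch of the general idea it describes: attention weights computed jointly from target and guidance features, applied at several successive levels with a residual shortcut. It is not the authors' BIAL or BIAB. The function names, the sigmoid gate, the scalar weights `w_t` and `w_g`, and the use of flat feature lists in place of convolutional feature maps are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bilateral_gate(target, guidance, w_t=1.0, w_g=1.0):
    """Hypothetical joint gate: each weight depends on BOTH inputs.

    w_t and w_g are illustrative scalars, not learned parameters
    taken from the paper.
    """
    return [sigmoid(w_t * t + w_g * g) for t, g in zip(target, guidance)]


def interactive_block(target, guidance):
    """One level of fusion (assumed form, not the paper's BIAB).

    Each branch keeps its own features (a residual shortcut) and adds
    the other branch's features scaled by the shared gate.
    """
    gate = bilateral_gate(target, guidance)
    new_target = [t + a * g for t, g, a in zip(target, guidance, gate)]
    new_guidance = [g + a * t for t, g, a in zip(target, guidance, gate)]
    return new_target, new_guidance


def multi_level_fusion(target, guidance, levels=3):
    """Apply the interactive block repeatedly, once per level."""
    for _ in range(levels):
        target, guidance = interactive_block(target, guidance)
    return target, guidance


if __name__ == "__main__":
    fused_target, fused_guidance = multi_level_fusion([0.2, -0.1], [0.5, 0.3])
    print(fused_target, fused_guidance)
```

The point of the sketch is the contrast with single-shot fusion: the two branches exchange information at every level instead of being merged once, and the gate is a function of both inputs instead of one.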