Deep Fusion Local-Content And Global-Semantic For Image Inpainting

Bin Jiang, Wei Huang, Yun Huang, Chao Yang, Fangqiang Xu
DOI: https://doi.org/10.1109/ACCESS.2020.3019826
IF: 3.9
2020-01-01
IEEE Access
Abstract: Upsampling layers are used in almost all existing encoder-decoder generative adversarial networks (GANs), which have shown promising results in image inpainting. However, existing upsampling layers (e.g., deconvolution and bilinear interpolation) suffer from two limitations: (1) they obtain little semantic information from the global structure, and (2) they can hardly capture local content details. To address these issues, we propose a Deep Fusion Local-content and Global-semantic (DFLG) model that is both effective and general. The DFLG model consists of four main components: the Local Content-Response (LCR) module, the pixel-shuffle operator, the Global Semantic-Aware (GSA) module, and the reassembly module. First, the LCR module divides the channels into several groups, then applies the squeeze-and-excitation mechanism within each group to capture the correlation between channels. Second, the pixel-shuffle operator reshapes depth in the channel dimension into width and height in the spatial dimensions, which transforms the correlation within groups in the channel dimension into correlation within patches in the spatial dimensions. Next, the GSA module employs a patch-based spatial attention mechanism to calculate the correlation between different patches. Finally, the reassembly module refines the feature map. Furthermore, we propose a novel loss function called Attention Loss (ATLoss), which guides the network to concentrate on regions with obvious artifacts. Experiments on the CelebA-HQ, Places2, and Paris StreetView datasets demonstrate the effectiveness of the proposed methods in image inpainting tasks and their ability to produce higher-quality images.
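The pixel-shuffle step described in the abstract can be illustrated with a minimal sketch. It assumes the common sub-pixel layout (the one used by PyTorch's nn.PixelShuffle), in which each group of r*r consecutive channels becomes one r-by-r spatial patch; the abstract does not state the exact layout used in DFLG, and the function name pixel_shuffle and the toy input are hypothetical.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Assumed layout: input channel c*r*r + i*r + j supplies the pixel at
    offset (i, j) inside every r-by-r patch of output channel c.
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(x[0]), len(x[0][0])
    # Pre-allocate the upsampled output, filled with zeros.
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


# One group of four 1x1 channels becomes a single 2x2 spatial patch.
features = [[[1]], [[2]], [[3]], [[4]]]
patch = pixel_shuffle(features, 2)
```

Under this assumed layout, the four channel values of a group end up as neighbours in one spatial patch, which is why correlation within a channel group turns into correlation within a patch.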