Abstract: Reconstructing missing pixels is essential for remote sensing images, which often suffer from problems such as occlusion, dead pixels, and scan line corrector (SLC)-off gaps. Image inpainting techniques can address these problems because they generate realistic content for the unknown regions of an image from the known regions. Recently, convolutional neural network (CNN)-based inpainting methods have integrated attention mechanisms to improve inpainting performance, since attention can capture long-range dependencies and adapt flexibly to its inputs. However, to obtain the attention map for each feature, these methods compute the similarity between that feature and the entire feature map, which may introduce noise from irrelevant features. To address this problem, we propose a novel adaptive attention (Ada-attention) that builds on self-attention and uses an offset position subnet to adaptively select the most relevant keys and values. This focuses the attention on essential features and lets it model more informative dependencies over the global range. Ada-attention works in four steps: first, an offset subnet predicts offset position maps from the query feature map; second, the most relevant features are sampled from the input feature map at the offset positions; third, key and value maps for self-attention are computed from the sampled features; finally, self-attention combines the query, key, and value maps to output the reconstructed feature map. Based on Ada-attention, we build a U-shaped adaptive-attention completing network (AACNet) to reconstruct missing regions. Experiments on several digital remote sensing and natural image datasets, in comparison with two image inpainting models and two remote sensing image reconstruction approaches, demonstrate that the proposed AACNet achieves good quantitative performance and good visual restoration results with regard to object integrity, texture/edge detail, and structural consistency.
Ablation studies indicate that Ada-attention outperforms self-attention in PSNR by 0.66%, in SSIM by 0.74%, and in MAE by 3.9%, and that the adaptive offset subnet lets it focus on valuable global features. Additionally, our approach has been successfully applied to removing real clouds from remote sensing images, generating credible content for cloudy regions.
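
The four Ada-attention steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify the architecture, so several details are assumptions made only for illustration: the offset subnet is replaced by a single linear map followed by `tanh`, reference points lie on a coarse grid controlled by a hypothetical `stride` parameter, features are sampled bilinearly, and the names `ada_attention`, `bilinear_sample`, `w_q`, `w_k`, `w_v`, and `w_off` are all invented here.

```python
import numpy as np

def bilinear_sample(feat, ys, xs):
    """Bilinearly sample feat (H, W, C) at float coordinates ys, xs of shape (N,)."""
    H, W, _ = feat.shape
    ys = np.clip(ys, 0.0, H - 1.0)
    xs = np.clip(xs, 0.0, W - 1.0)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, H - 1)
    x1 = np.minimum(x0 + 1, W - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[:, None]
    top = feat[y0, x0] * (1 - wx) + feat[y0, x1] * wx
    bot = feat[y1, x0] * (1 - wx) + feat[y1, x1] * wx
    return top * (1 - wy) + bot * wy

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def ada_attention(x, params, stride=2, max_offset=2.0):
    """Illustrative adaptive attention over a feature map x of shape (H, W, C)."""
    H, W, C = x.shape
    q = x @ params["w_q"]  # query map, shape (H, W, C)

    # Assumed reference points: a coarse grid over the feature map.
    ref_y, ref_x = np.meshgrid(
        np.arange(0, H, stride), np.arange(0, W, stride), indexing="ij"
    )
    ref_y, ref_x = ref_y.ravel(), ref_x.ravel()

    # Step 1: stand-in "offset subnet" predicts bounded (dy, dx) from the query map.
    offsets = max_offset * np.tanh(q[ref_y, ref_x] @ params["w_off"])  # (N, 2)

    # Step 2: sample input features at the shifted positions.
    sampled = bilinear_sample(x, ref_y + offsets[:, 0], ref_x + offsets[:, 1])

    # Step 3: keys and values come only from the sampled features.
    k = sampled @ params["w_k"]
    v = sampled @ params["w_v"]

    # Step 4: scaled dot-product attention of every query over the sampled set.
    attn = softmax(q.reshape(H * W, C) @ k.T / np.sqrt(C), axis=-1)  # (H*W, N)
    out = (attn @ v).reshape(H, W, C)
    return out, attn

# Usage with random weights (hypothetical sizes).
rng = np.random.default_rng(0)
C = 4
x = rng.normal(size=(6, 6, C))
params = {
    "w_q": rng.normal(size=(C, C)),
    "w_k": rng.normal(size=(C, C)),
    "w_v": rng.normal(size=(C, C)),
    "w_off": rng.normal(size=(C, 2)),
}
out, attn = ada_attention(x, params)
```

In this sketch each query attends to only the sampled positions rather than to all H×W positions, which is one plausible reading of how selecting keys and values can limit noise from irrelevant features.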

Video Inpainting Based on Residual Convolution Attention Network

The Improved Image Inpainting Algorithm Via Encoder and Similarity Constraint

Image Inpainting Based on Interactive Separation Network and Progressive Reconstruction Algorithm

Context Adaptive Network for Image Inpainting

Deep video inpainting detection and localization based on ConvNeXt dual-stream network

Free-Form Image Inpainting with Separable Gate Encoder-Decoder Network

Adaptive-Attention Completing Network for Remote Sensing Image

Temporal Adaptive Alignment Network for Deep Video Inpainting

Progressive Temporal Feature Alignment Network for Video Inpainting

Spatial-Temporal Residual Aggregation for High Resolution Video Inpainting

DANet: Deformable Alignment Network for Video Inpainting

Interactive Separation Network for Image Inpainting

A low-latency inpainting method for unstably transmitted videos

Video Inpainting by Jointly Learning Temporal Structure and Spatial Details

Structure-Guided Deep Video Inpainting

High-Resolution Image Inpainting Based On Multi-Scale Neural Network

Align-and-Attend Network for Globally and Locally Coherent Video Inpainting

DNNAM: Image inpainting algorithm via deep neural networks and attention mechanism

3DPF-FBN: Video Inpainting by Jointly 3D-Patch Filling and Neural Network Refinement

Temporal Group Fusion Network for Deep Video Inpainting

Learning Joint Spatial-Temporal Transformations for Video Inpainting