Abstract: Image inpainting based on generative adversarial networks (GANs) has achieved great success in producing visually plausible images and plays an important role in many real-world tasks. However, inpainting techniques can also be used maliciously, e.g., to alter or remove objects of interest in order to fabricate news. Recently developed inpainting detection algorithms show promising performance, but they are built on convolutional neural networks (CNNs) with limited receptive fields. Consequently, they fail to fully capture the disparity between inpainted and untouched regions and are therefore ineffective at producing fine-grained detection results. In this work, we develop a new image inpainting detection approach. First, we propose a locally enhanced transformer architecture tailored for image inpainting detection. Unlike previous CNN-based methods, our approach leverages both the short-range and long-range dependencies of pixels, enabling it to learn the diverse statistical behaviors of inpainted and untouched regions. Second, to mitigate the distraction caused during training by near-edge pixels, which have a mixed nature, we propose decoupling the label into a body map and a soft-edge map, and we design a cross-modality attention module to propagate information between them interactively. We show that this decoupling strategy outperforms conventional edge supervision in enhancing detection accuracy. Finally, we devise a constrained adversarial training methodology that accounts for the adversarial generation procedure of deep image inpainting methods. We further show that constrained adversarial training enhances detection performance by adaptively introducing interference noise in the inpainted regions. Extensive experiments validate the superiority of our scheme over existing CNN-based methods and demonstrate that its detection generalizes well to both deep and traditional inpainting algorithms.
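To make the label-decoupling idea concrete, the following is a minimal sketch of one way a binary inpainting mask could be split into a body map and a soft-edge map. The abstract does not define either map, so everything here is an assumption for illustration: the function name `decouple_label`, the `radius` parameter, the use of Chebyshev distance to the mask boundary, and the linear decay of the edge weight are all hypothetical and may differ from the authors' formulation.

```python
def decouple_label(mask, radius=2):
    """Split a binary inpainting mask into (body, soft_edge) maps.

    Illustrative assumption only, not the paper's definition.
    mask:   list of rows of 0/1, where 1 marks an inpainted pixel.
    radius: hypothetical half-width of the near-edge band, in pixels.
    """
    h, w = len(mask), len(mask[0])

    def boundary_distance(y, x):
        # Chebyshev distance to the nearest pixel with the opposite label,
        # reported as radius + 1 when none lies within the search window.
        for d in range(1, radius + 1):
            for yy in range(max(0, y - d), min(h, y + d + 1)):
                for xx in range(max(0, x - d), min(w, x + d + 1)):
                    if mask[yy][xx] != mask[y][x]:
                        return d
        return radius + 1

    body = [[0.0] * w for _ in range(h)]
    soft_edge = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            d = boundary_distance(y, x)
            if d <= radius:
                # Near-edge pixel (either side of the boundary):
                # weight decays linearly with distance from the boundary.
                soft_edge[y][x] = 1.0 - (d - 1) / radius
            elif mask[y][x] == 1:
                # Interior inpainted pixel, far from any boundary.
                body[y][x] = 1.0
    return body, soft_edge
```

Under these assumptions, pixels deep inside the inpainted region receive a hard body label, while pixels on either side of the boundary receive a graded edge weight, so the ambiguous near-edge pixels are supervised separately from the unambiguous interior.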

A transformer–CNN for deep image inpainting forensics

Rethinking Fast Fourier Convolution in Image Inpainting

A Double Feature Fusion Network with Progressive Learning for Sharper Inpainting

CTNet: hybrid architecture based on CNN and transformer for image inpainting detection

Free-Form Image Inpainting with Separable Gate Encoder-Decoder Network

A Frequency Attention-Based Dual-Stream Network for Image Inpainting Forensics

Parallel Multi-Resolution Fusion Network for Image Inpainting

ITrans: generative image inpainting with transformers

Image Inpainting Detection Based on Multi-task Deep Learning Network

Image Inpainting Based on Interactive Separation Network and Progressive Reconstruction Algorithm

Robust Image Inpainting Forensics by Using an Attention-Based Feature Pyramid Network

Transformer-Based Image Inpainting Detection via Label Decoupling and Constrained Adversarial Training

Deep video inpainting detection and localization based on ConvNeXt dual-stream network

The Improved Image Inpainting Algorithm Via Encoder and Similarity Constraint

Delving Globally into Texture and Structure for Image Inpainting

HINT: High-quality INPainting Transformer with Mask-Aware Encoding and Enhanced Attention

Progressive Feedback-Enhanced Transformer for Image Forgery Localization

DMFF-Net: Double-stream multilevel feature fusion network for image forgery localization

Distillation-guided Image Inpainting

Bridging partial-gated convolution with transformer for smooth-variation image inpainting