Abstract: Digital video inpainting techniques have improved substantially with deep learning in recent years. Although inpainting was originally designed to repair damaged areas, it can also be used maliciously to remove important objects and thereby fabricate scenes and facts. It is therefore important to identify inpainted regions blindly. In this paper, we present a Trusted Video Inpainting Localization network (TruVIL) with excellent robustness and generalization ability. Observing that high-frequency noise can effectively unveil the inpainted regions, we design deep attentive noise learning in multiple stages to capture the inpainting traces. First, a multi-scale noise extraction module based on 3D High Pass (HP3D) layers is used to create the noise modality from the input RGB frames. Then, the correlation between these two complementary modalities is explored by a cross-modality attentive fusion module to facilitate mutual feature learning. Lastly, spatial details are selectively enhanced by an attentive noise decoding module to boost the localization performance of the network. To prepare enough training samples, we also build a frame-level video object segmentation dataset of 2500 videos with pixel-level annotation for all frames. Extensive experimental results validate the superiority of TruVIL over state-of-the-art methods. In particular, both quantitative and qualitative evaluations on various inpainted videos verify the remarkable robustness and generalization ability of the proposed TruVIL. Code and dataset will be available at https://github.com/multimediaFor/TruVIL.
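To make the idea of a noise modality concrete, the following is a minimal sketch of spatio-temporal high-pass filtering on a stack of frames. It is not the HP3D layer of TruVIL, whose kernels and multi-scale design are not specified in the abstract. The fixed 3x3x3 zero-sum kernel, the single-channel (T, H, W) input, and the helper names `make_high_pass_kernel` and `high_pass_3d` are all assumptions made for illustration.

```python
import numpy as np


def make_high_pass_kernel() -> np.ndarray:
    """Assumed 3x3x3 zero-sum kernel: centre weight 1, each of the 26 neighbours -1/26."""
    kernel = np.full((3, 3, 3), -1.0 / 26.0)
    kernel[1, 1, 1] = 1.0
    return kernel


def high_pass_3d(video: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Cross-correlate a (T, H, W) volume with a 3D kernel, keeping only 'valid' positions.

    Because the kernel sums to zero, smooth content is suppressed and only
    local deviations (the high-frequency residual) remain.
    """
    kt, kh, kw = kernel.shape
    t, h, w = video.shape
    out_t, out_h, out_w = t - kt + 1, h - kh + 1, w - kw + 1
    out = np.zeros((out_t, out_h, out_w), dtype=np.float64)
    # Accumulate one shifted, weighted copy of the volume per kernel tap.
    for dt in range(kt):
        for dy in range(kh):
            for dx in range(kw):
                out += kernel[dt, dy, dx] * video[dt:dt + out_t, dy:dy + out_h, dx:dx + out_w]
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.random((8, 32, 32))  # hypothetical grayscale clip: 8 frames of 32x32
    noise = high_pass_3d(frames, make_high_pass_kernel())
    print(noise.shape)
```

In a learned network, a fixed filter of this kind would more plausibly serve as an initialization or a constraint for a 3D convolution than as the final operator. The point of the sketch is only that a zero-sum kernel removes flat content, leaving a residual in which inpainting traces may be easier to see.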

Trusted Video Inpainting Localization via Deep Attentive Noise Learning
