Abstract: Splicing forgery, which manipulates images by copying regions from donor images and pasting them into host images, is one of the most common types of image forgery; the copied regions may be object regions or background regions. To detect these forged regions accurately, the mainstream approach is to use an encoder-decoder network that extracts manipulation traces to determine whether each pixel of the input image has been spliced. However, because the receptive field of such networks is limited, only local manipulation traces can be learned, so forgeries of large object areas and of background regions are not localized well. To address these issues, this paper proposes an end-to-end splicing detection framework comprising a localization network (L-Net), a manipulation traces attention network (MTA-Net), and an adaptive multi-scale fusion module (AMSFM). L-Net is designed as an encoder-decoder network that extracts local manipulation traces for each pixel and localizes the spliced areas. MTA-Net uses the proposed content-remove convolutional layer (CRCL) to suppress image content information that would hinder the network from learning manipulation traces, and then uses subsequent convolutional layers to extract features that discriminate whether the input image is spliced. In this process, the regions of the convolutional feature maps with large activation values are the ones that contain global manipulation traces. These global manipulation traces are fused with the local manipulation traces learned by L-Net through the proposed AMSFM, allowing L-Net to handle object forgeries and background forgeries of various sizes effectively.
Ablation experiments showed increases of 4.6% and 3.9% in F1-score and MCC after the introduction of MTA-Net and AMSFM, respectively. The splicing-region detection performance on three standard datasets, CASIA, COLUMB, and CARVALHO, shows that the proposed method outperforms state-of-the-art methods for both object forgery and background forgery, and is more robust to post-processing such as JPEG compression and noise addition.
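
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how a pixel-level F1-score and Matthews correlation coefficient (MCC) can be computed from a predicted splicing mask and a ground-truth mask. It uses the standard textbook definitions; the function names, the toy masks, and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration and are not taken from the paper's evaluation code.

```python
import math


def confusion_counts(pred_mask, true_mask):
    """Count TP, FP, FN, TN over two same-shaped binary masks (1 = spliced pixel)."""
    tp = fp = fn = tn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def f1_score(tp, fp, fn, tn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if undefined (assumed convention)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, fn, tn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical 3x3 masks, for illustration only.
predicted = [[1, 1, 0],
             [0, 1, 0],
             [0, 0, 0]]
ground_truth = [[1, 1, 0],
                [1, 0, 0],
                [0, 0, 0]]

counts = confusion_counts(predicted, ground_truth)
print("F1 :", f1_score(*counts))
print("MCC:", mcc(*counts))
```

Unlike F1, MCC also rewards correctly classified authentic pixels (true negatives), which is one reason the two are often reported together when the spliced region occupies only a small part of the image.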

Video Splicing Detection and Localization Based on Multi-Level Deep Feature Fusion and Reinforcement Learning

Different-quality Re-demosaicing in Digital Image Forensics

Image splicing localization based on re-demosaicing

A Blind Forensics Method for Image Splicing Based on Original Image Estimation Using Color Filter Array Interpolation

End-to-end Image Splicing Localization Based on Multi-Scale Features and Residual Refinement Module

Joint Manipulation Trace Attention Network and Adaptive Fusion Mechanism for Image Splicing Forgery Localization

Feature Aggregation and Region-Aware Learning for Detection of Splicing Forgery

Passive Forensic Based on Spatio-Temporal Localization of Video Object Removal Tampering

Image‐splicing forgery detection based on local binary patterns of DCT coefficients

Spatiotemporal Trident Networks: Detection and Localization of Object Removal Tampering in Video Passive Forensics

Exposing video surveillance object forgery by combining TSF features and attention-based deep neural networks

UVL2: A Unified Framework for Video Tampering Localization

DEEP-STA: Deep Learning-Based Detection and Localization of Various Types of Inter-Frame Video Tampering Using Spatiotemporal Analysis

Coarse-to-fine-grained method for image splicing region detection

Unified Video and Image Representation for Boosted Video Face Forgery Detection

Inter-frame Passive-Blind Forgery Detection for Video Shot Based on Similarity Analysis

Detecting Image Splicing Based on Noise Level Inconsistency

CNN Spatiotemporal Features and Fusion for Surveillance Video Forgery Detection

Audio splicing detection and localization using multistage filterbank spectral sketches and decision fusion

ET: Edge-Enhanced Transformer for Image Splicing Detection