Joint Manipulation Trace Attention Network and Adaptive Fusion Mechanism for Image Splicing Forgery Localization
Yuanlu Wu, Yan Wo, Guoqiang Han
DOI: https://doi.org/10.1007/s11042-022-13151-0
IF: 2.577
2022-01-01
Multimedia Tools and Applications
Abstract: Splicing forgery, which manipulates images by copying regions from donor images and pasting them into host images, is one of the most common types of image forgery; the copied regions may be object regions or background regions. To detect these forged regions accurately, the prevailing approach is an encoder-decoder network architecture that extracts enough manipulation traces to determine whether each pixel of the input image has been spliced. However, because the receptive field of such networks is limited, they learn only local manipulation traces, and therefore some large-object forgeries and background forgeries cannot be localized well. To address these issues, this paper proposes an end-to-end splicing detection framework comprising a localization network (L-Net), a manipulation trace attention network (MTA-Net), and an adaptive multi-scale fusion module (AMSFM). L-Net is designed as an encoder-decoder network that extracts local manipulation traces for each pixel and localizes the spliced areas. MTA-Net uses the proposed content-remove convolutional layer (CRCL) to suppress image content information that would hinder the network from learning manipulation traces, and then uses subsequent convolutional layers to extract features that discriminate whether the input image is spliced. In this process, the regions of the convolutional feature maps with large activation values are the ones that contain global manipulation traces. These global manipulation traces are fused with the local manipulation traces learned by L-Net through the AMSFM, allowing L-Net to handle object forgeries and background forgeries of various sizes effectively.
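The idea of suppressing image content so that faint manipulation traces stand out can be illustrated with a fixed high-pass residual filter. The sketch below is only an illustration of that general idea, not the paper's CRCL (which is a learned layer whose details the abstract does not give); the function name `content_suppressing_residual` and the choice of an 8-neighbour mean are assumptions made for this example.

```python
import numpy as np


def content_suppressing_residual(image):
    """Return (mean of the 8 neighbours) - (pixel) for every pixel.

    Smooth image content cancels out, so what remains is a residual
    dominated by local irregularities. This is a hand-fixed filter used
    for illustration only; it is not the learned CRCL from the paper.
    """
    img = np.asarray(image, dtype=np.float64)
    padded = np.pad(img, 1, mode="edge")  # replicate border pixels
    h, w = img.shape
    neighbour_sum = np.zeros_like(img)
    for dy in (0, 1, 2):
        for dx in (0, 1, 2):
            if dy == 1 and dx == 1:
                continue  # skip the centre pixel itself
            neighbour_sum += padded[dy:dy + h, dx:dx + w]
    return neighbour_sum / 8.0 - img


if __name__ == "__main__":
    flat = np.full((5, 5), 7.0)           # pure "content", no irregularity
    print(content_suppressing_residual(flat).max())
    spike = np.zeros((5, 5))
    spike[2, 2] = 8.0                     # a single local irregularity
    print(content_suppressing_residual(spike)[2, 2])
```

A uniform region produces a zero residual, while an isolated irregular pixel produces a strong response, which is the kind of content-independent signal a trace-extraction network would be expected to build on.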
Ablation experiments showed an increase of 4.6% and 3.9% in F1-score and MCC after the introduction of MTA-Net and AMSFM, respectively. The splicing region detection performance on three standard datasets, CASIA, COLUMB, and CARVALHO, shows that the proposed method outperforms state-of-the-art methods for both object forgery and background forgery, and is more robust to post-processing operations such as JPEG compression and noise addition.
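For readers unfamiliar with the two reported metrics, the following minimal sketch computes the pixel-level F1-score and Matthews correlation coefficient (MCC) from a binary predicted mask and a binary ground-truth mask. The standard definitions are used; the function names and the flat-list mask representation are assumptions for this example, and the paper's exact evaluation protocol (thresholding, per-image averaging) is not specified in the abstract.

```python
import math


def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two equal-length binary sequences."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif not p and t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, fn, tn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # 1 = pixel marked as spliced, 0 = pixel marked as authentic
    pred = [1, 1, 0, 0, 1, 0]
    truth = [1, 0, 0, 0, 1, 1]
    tp, fp, fn, tn = confusion_counts(pred, truth)
    print(f"F1  = {f1_score(tp, fp, fn):.4f}")
    print(f"MCC = {mcc(tp, fp, fn, tn):.4f}")
```

Unlike F1, MCC also accounts for true negatives, which makes it informative when the spliced region covers only a small fraction of the image.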