A transformer–CNN for deep image inpainting forensics

Xinshan Zhu, Junyan Lu, Honghao Ren, Hongquan Wang, Biao Sun
DOI: https://doi.org/10.1007/s00371-022-02620-0
2022-08-06
Abstract: Image inpainting is an advanced image editing technology that leaves very weak traces in the tampered image, which raises serious security concerns, particularly for inpainting methods based on deep learning. In this paper, we propose the global–local feature fusion network (GLFFNet) to locate image regions tampered by deep-learning-based inpainting. GLFFNet consists of a two-stream encoder and a decoder. The two-stream encoder comprises a spatial self-attention stream (SSAS) and a noise feature extraction stream (NFES). The SSAS uses a transformer network to extract global features of deep inpainting manipulations. The NFES is built from residual blocks, which learn manipulation features from noise maps produced by filtering the input image. A feature fusion layer fuses the features output by the encoder and feeds them into the decoder, where up-sampling and convolutional operations derive the confidence map for inpainting manipulation. The proposed network is trained with a purpose-designed two-stage loss function. Experimental results show that GLFFNet achieves high localization accuracy for deep inpainting manipulations and effectively resists JPEG compression and additive noise attacks.
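To make the idea of a "noise map produced by filtering the input image" concrete, here is a minimal sketch in plain Python. It is an illustration only: the abstract does not say which filters the NFES uses, so the 3x3 high-pass kernel `HIGH_PASS`, the function name `noise_map`, and the edge-replication border handling are all assumptions rather than the authors' method.

```python
# Hypothetical 3x3 high-pass kernel (coefficients sum to zero).
# The abstract does not specify the filters used by GLFFNet; this is an assumption.
HIGH_PASS = [
    [-1, -1, -1],
    [-1,  8, -1],
    [-1, -1, -1],
]


def noise_map(image, kernel=HIGH_PASS):
    """Return the high-pass residual of a 2-D grayscale image (list of rows).

    Borders are handled by replicating the nearest edge pixel, so the
    output has the same size as the input.
    """
    h, w = len(image), len(image[0])
    k = len(kernel) // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in range(-k, k + 1):
                for dx in range(-k, k + 1):
                    # Clamp coordinates so border pixels reuse the nearest edge value.
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + k][dx + k] * image[yy][xx]
            out[y][x] = acc
    return out


# A flat 5x5 image, and a copy with one slightly altered pixel.
flat = [[10] * 5 for _ in range(5)]
patched = [row[:] for row in flat]
patched[2][2] = 14

flat_residual = noise_map(flat)
patched_residual = noise_map(patched)
```

Because the kernel sums to zero, smooth image content is suppressed and only local deviations survive, which is the general motivation for feeding a residual rather than raw pixels to a forensic feature extractor. In this toy case the flat image yields an all-zero map, while the single altered pixel stands out in the residual.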
computer science, software engineering