Pixel-Inconsistency Modeling for Image Manipulation Localization

Chenqi Kong, Anwei Luo, Shiqi Wang, Haoliang Li, Anderson Rocha, Alex C. Kot
2023-09-30
Abstract: Digital image forensics plays a crucial role in image authentication and manipulation localization. Despite the progress powered by deep neural networks, existing forgery localization methodologies exhibit limitations when deployed to unseen datasets and perturbed images (i.e., lack of generalization and robustness to real-world applications). To circumvent these problems and aid image integrity, this paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts. The rationale is grounded on the observation that most image signal processors (ISP) involve the demosaicing process, which introduces pixel correlations in pristine images. Moreover, manipulating operations, including splicing, copy-move, and inpainting, directly affect such pixel regularity. We, therefore, first split the input image into several blocks and design masked self-attention mechanisms to model the global pixel dependency in input images. Simultaneously, we optimize another local pixel dependency stream to mine local manipulation clues within input forgery images. In addition, we design novel Learning-to-Weight Modules (LWM) to combine features from the two streams, thereby enhancing the final forgery localization performance. To improve the training process, we propose a novel Pixel-Inconsistency Data Augmentation (PIDA) strategy, driving the model to focus on capturing inherent pixel-level artifacts instead of mining semantic forgery traces. This work establishes a comprehensive benchmark integrating 15 representative detection models across 12 datasets. Extensive experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints and achieves state-of-the-art generalization and robustness performances in image manipulation localization.
Cryptography and Security, Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
This paper addresses two major problems in image forensics: **insufficient generalization** and **poor robustness**. Existing forgery localization methods degrade when applied to unseen datasets and perturbed images, which limits their use in real-world applications. To address this, the authors propose an image manipulation localization method based on pixel-inconsistency modeling.

### 1. Research Background and Problems

As digital image processing tools improve, image tampering (splicing, copy-move, inpainting, and so on) has become more sophisticated and harder to detect. These operations disrupt the pixel regularity of the original image, in particular the periodic pixel correlations introduced by demosaicing. Detecting and localizing tampered regions effectively is therefore an important research topic.

### 2. Main Contributions of the Paper

To improve the generalization and robustness of image tampering localization, the paper proposes the following:

- **Two-stream pixel-dependency modeling framework**: Pixel inconsistencies are captured by a local pixel-dependency encoder and a global pixel-dependency encoder. The local encoder uses Pixel-Difference Convolution (PDC) blocks to capture inconsistencies within local regions, while the global encoder models pixel dependencies across the whole input image through a masked self-attention mechanism.
- **Learning-to-Weight Module (LWM)**: To fuse local and global features, the authors introduce a module that adjusts the importance of each feature stream according to learned weights.
- **Pixel-Inconsistency Data Augmentation (PIDA)**: To improve training, the authors propose a data augmentation strategy that generates forged samples only from real images. This drives the model to focus on pixel-level inconsistencies rather than semantic forgery traces.

### 3. Experimental Results

The paper establishes a benchmark of 15 representative detection models across 12 datasets. On this benchmark, the proposed model is reported to achieve state-of-the-art generalization and robustness: it extracts pixel-inconsistency forgery fingerprints and maintains its performance under different types of image perturbation.

### 4. Formula Representation

Some key formulas from the paper are as follows:

- Masked self-attention mechanism:

\[ z_{i + 1}=\text{Mask}\left[\text{softmax}\left(\frac{f_{\text{query}}(z_i) f_{\text{key}}(z_i)^{\top}}{\sqrt{d}}\right)\right] f_{\text{value}}(z_i) \]

- Pixel-Difference Convolution (CPDC and RPDC variants):

\[ f_C^l=\sum_{(x_i, x_c)\in\Omega} w_i(x_i - x_c) \]

\[ f_R^l=\sum_{(x_i, x'_i)\in\Omega} w_i(x_i - x'_i) \]

- Learning-to-Weight Module (LWM):

\[ f_F = f_1\oplus f_2+A_1\odot f_1+A_2\odot f_2 \]

Together, these components give a more general and robust approach to image tampering localization.
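To make the CPDC formula above concrete, here is a minimal sketch in plain Python, not the authors' implementation. It assumes that \(x_c\) denotes the centre pixel of a square patch and that \(\Omega\) pairs every pixel in the patch with that centre; the function names, the 3x3 weights, and the sample patches are hypothetical and chosen only for illustration. The second function relies on the identity \(\sum_i w_i(x_i - x_c) = \sum_i w_i x_i - x_c \sum_i w_i\), which shows that the same response can be obtained from an ordinary weighted sum.

```python
def cpdc_response(patch, weights):
    """Central pixel-difference response for one square patch.

    Computes sum_i w_i * (x_i - x_c), taking x_c to be the patch centre
    (an assumption about the notation in the summary).
    """
    size = len(patch)
    centre = patch[size // 2][size // 2]
    return sum(
        weights[r][c] * (patch[r][c] - centre)
        for r in range(size)
        for c in range(size)
    )


def cpdc_as_vanilla_conv(patch, weights):
    """Same response rewritten as sum_i w_i * x_i - x_c * sum_i w_i."""
    size = len(patch)
    centre = patch[size // 2][size // 2]
    cells = [(r, c) for r in range(size) for c in range(size)]
    weighted_sum = sum(weights[r][c] * patch[r][c] for r, c in cells)
    weight_total = sum(weights[r][c] for r, c in cells)
    return weighted_sum - centre * weight_total


# Hypothetical 3x3 kernel and patches, for illustration only.
weights = [[1, 2, 1],
           [0, 3, 0],
           [-1, 1, 2]]
flat_patch = [[7, 7, 7],
              [7, 7, 7],
              [7, 7, 7]]
edge_patch = [[1, 1, 5],
              [1, 1, 5],
              [1, 1, 5]]

print(cpdc_response(flat_patch, weights))         # 0: no pixel differences
print(cpdc_response(edge_patch, weights))         # 12
print(cpdc_as_vanilla_conv(edge_patch, weights))  # 12, same as above
```

A perfectly uniform patch yields a zero response whatever the weights are, because every difference \(x_i - x_c\) vanishes. The operator therefore responds to how pixels relate to their neighbours rather than to absolute intensity, which fits the summary's description of the local stream as capturing pixel inconsistencies.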