OmniGuard: Hybrid Manipulation Localization via Augmented Versatile Deep Image Watermarking

Xuanyu Zhang, Zecheng Tang, Zhipei Xu, Runyi Li, Youmin Xu, Bin Chen, Feng Gao, Jian Zhang
2024-12-02
Abstract: With the rapid growth of generative AI and its widespread application in image editing, new risks have emerged regarding the authenticity and integrity of digital content. Existing versatile watermarking approaches suffer from trade-offs between tamper localization precision and visual quality. Constrained by the limited flexibility of previous frameworks, their localization watermark must remain fixed across all images. Under AIGC-editing, their copyright extraction accuracy is also unsatisfactory. To address these challenges, we propose OmniGuard, a novel augmented versatile watermarking approach that integrates proactive embedding with passive, blind extraction for robust copyright protection and tamper localization. OmniGuard employs a hybrid forensic framework that enables flexible localization watermark selection and introduces a degradation-aware tamper extraction network for precise localization under challenging conditions. Additionally, a lightweight AIGC-editing simulation layer is designed to enhance robustness across global and local editing. Extensive experiments show that OmniGuard achieves superior fidelity, robustness, and flexibility. Compared to the recent state-of-the-art approach EditGuard, our method outperforms it by 4.25 dB in PSNR of the container image, 20.7% in F1-Score under noisy conditions, and 14.8% in average bit accuracy.
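The three metrics quoted in the abstract (PSNR of the container image, F1-Score of the localization mask, and bit accuracy of the copyright message) have standard definitions. The following is a minimal sketch of those definitions, assuming images are given as flat sequences of 8-bit pixel values and masks and messages as flat sequences of 0/1. The function names are hypothetical and are not taken from the paper or its code.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences."""
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def f1_score(pred_mask, gt_mask):
    """F1 score between a predicted and a reference binary mask (values 0 or 1)."""
    if len(pred_mask) != len(gt_mask):
        raise ValueError("masks must be of equal length")
    pairs = list(zip(pred_mask, gt_mask))
    tp = sum(1 for p, g in pairs if p == 1 and g == 1)
    fp = sum(1 for p, g in pairs if p == 1 and g == 0)
    fn = sum(1 for p, g in pairs if p == 0 and g == 1)
    denom = 2 * tp + fp + fn
    # Assumption: two empty masks count as perfect agreement.
    return 2 * tp / denom if denom else 1.0


def bit_accuracy(pred_bits, true_bits):
    """Fraction of message bits recovered correctly."""
    if len(pred_bits) != len(true_bits) or len(true_bits) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(1 for p, t in zip(pred_bits, true_bits) if p == t) / len(true_bits)


if __name__ == "__main__":
    print(psnr([10, 20, 30, 40], [12, 18, 30, 41]))
    print(f1_score([1, 1, 0, 0], [1, 0, 0, 1]))
    print(bit_accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

A higher PSNR means the watermarked (container) image is closer to the original, which is the sense in which the abstract uses it as a fidelity measure.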
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses several key challenges that existing deep image watermarking techniques face under generative AI (AIGC) editing and image tampering:

1. **Fidelity**: Existing multi-purpose image watermarking methods face an inevitable trade-off between tamper localization accuracy and the fidelity of the watermarked image. Achieving satisfactory localization accuracy often means sacrificing the quality of the watermarked image.
2. **Flexibility**: Because the predefined localization watermark must be known at the decoding stage to extract the mask, it is usually fixed across all images, which greatly limits the flexibility of information embedding.
3. **Robustness**: Existing multi-purpose watermarking methods perform poorly under severe degradations (such as brightness adjustment or heavy image compression). In addition, global AIGC editing algorithms may remove the copyright watermark, further weakening their effectiveness.

To solve these problems, the authors propose a new augmented multi-purpose watermarking framework, OmniGuard. By combining proactive watermark embedding with a passive blind extraction network, the framework aims to improve tamper localization accuracy, robustness, and the fidelity of the watermarked image. Specific improvements include:

- **Hybrid forensic framework**: A deep degradation-aware tamper extractor is introduced, which extracts the tampered area more accurately under severe degradation conditions.
- **Adaptive watermark transformation**: An adaptive watermark transformation mechanism lets the network selectively embed localization information according to the image content, thereby improving the fidelity of the container image.
- **Lightweight AIGC-editing simulation layer**: Improves the network's copyright extraction accuracy under global and local editing, so that copyright remains protected across a variety of editing operations.

With these improvements, OmniGuard shows significant advantages over existing methods on multiple evaluation metrics, such as PSNR, F1-Score, and average bit accuracy.

### Summary of Mathematical Formulas

The key formulas involved in the paper are as follows:

1. **Tampered image model**: \[ I_{\text{rec}} = D_{\text{deg}}\big(I_{\text{con}} \odot (1 - M_{\text{gt}}) + E_{\text{edit}}(I_{\text{con}}) \odot M_{\text{gt}}\big) \] where \( E_{\text{edit}}(\cdot) \), \( D_{\text{deg}}(\cdot) \), and \( M_{\text{gt}} \) represent the editing function, the degradation function, and the tampering mask, respectively.
2. **Tampered area calculation**: \[ \hat{M}_{\text{loc}} = \Theta_\tau\big(|\hat{W}_{\text{loc}} - W_{\text{loc}}|\big) \] where \( \Theta_\tau(z) = 1 \) when \( z \geq \tau \) (and 0 otherwise), and \( | \cdot | \) represents the absolute value operation.
3. **Enhanced pseudo-code**: \[ \tilde{T}_{\text{loc}} = T_{\text{loc}} + \beta \cdot \text{Conv}(F_{\text{rec}} \odot Q_{\text{deg}}) \] where \( F_{\text{rec}} = \text{Sigmoid}(\text{MLP}(\text{GAP}(T_{\text{rec}}))) \), \( \beta \) is a learnable trade-off parameter, and \( \odot \) represents element-wise multiplication.
4. **Loss function**: \[ \ell_{\text{cop}}=\ell_{\text{bce}}(\hat{w}_{\text{cop}},