Abstract: Safeguarding intellectual property and preventing potential misuse of AI-generated images are of paramount importance. This paper introduces a robust and agile plug-and-play watermark detection framework, dubbed RAW. As a departure from traditional encoder-decoder methods, which incorporate fixed binary codes as watermarks within latent representations, our approach introduces learnable watermarks directly into the original image data. Subsequently, we employ a classifier that is jointly trained with the watermark to detect the presence of the watermark. The proposed framework is compatible with various generative architectures and supports on-the-fly watermark injection after training. By incorporating state-of-the-art smoothing techniques, we show that the framework provides provable guarantees regarding the false positive rate for misclassifying a watermarked image, even in the presence of certain adversarial attacks targeting watermark removal. Experiments on a diverse range of images generated by state-of-the-art diffusion models reveal substantial performance enhancements compared to existing approaches. For instance, our method demonstrates a notable increase in AUROC, from 0.48 to 0.82, when compared to state-of-the-art approaches in detecting watermarked images under adversarial attacks, while maintaining image quality, as indicated by closely aligned FID and CLIP scores.

What problem does this paper attempt to address?

The paper aims to protect the intellectual property of AI-generated images and to prevent their misuse. It introduces RAW (Robust and Agile Watermark), a robust and flexible plug-and-play watermark detection framework. Unlike traditional encoder-decoder methods, which embed a fixed binary code in the latent representation, RAW embeds a learnable watermark directly in the original image data and detects it with a classifier that is trained jointly with the watermark. As a result, the framework is compatible with various generative architectures and supports on-the-fly watermark injection after training. By incorporating advanced smoothing techniques, it also provides provable guarantees on the false positive rate, and these guarantees hold even under certain adversarial attacks that target watermark removal.

The main contributions of the paper are:

1. **Proposing a new watermark learning framework**: Unlike traditional encoder-decoder techniques, RAW embeds learnable watermarks of the same size as the image in both the frequency domain and the spatial domain, and trains them jointly with a convolutional neural network (CNN) classifier.
2. **Providing provable guarantees on the false positive rate under adversarial attacks**: By drawing on methods from the conformal prediction literature, RAW provides rigorous, distribution-free false positive rate guarantees. The paper also develops a new technique inspired by randomized smoothing to further strengthen these guarantees.
3. **Extensive empirical evaluation**: The paper evaluates the method on multiple datasets, including images from DiffusionDB and MS-COCO. The results show that RAW performs well in detection performance, robustness to image manipulations and attacks, computational efficiency of watermark injection, and quality of the generated images. For example, in detecting watermarked images under adversarial attacks, RAW raises AUROC from 0.48 to 0.82 compared with state-of-the-art approaches.

Overall, RAW aims to provide an efficient, robust, and easy-to-integrate watermarking scheme for AI-generated images, one that is especially suitable for real-time deployment and for use by third-party users.
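The distribution-free false positive rate guarantee described above can be illustrated with a split-conformal threshold. The sketch below is not the paper's implementation. It assumes, following the abstract's wording, that the error to control is misclassifying a watermarked image, that the classifier outputs a scalar score where higher means "more likely watermarked", and that a held-out set of scores from watermarked images is available. The names `conformal_threshold`, `is_watermarked`, `calib_scores`, and `alpha` are hypothetical.

```python
import math


def conformal_threshold(calib_scores, alpha):
    """Pick a score threshold from held-out watermarked images.

    Assumption: calib_scores are classifier scores on watermarked
    calibration images, exchangeable with future watermarked images.
    With k = floor(alpha * (n + 1)), a new watermarked score falls
    strictly below the k-th smallest calibration score with
    probability at most k / (n + 1) <= alpha.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    n = len(calib_scores)
    k = math.floor(alpha * (n + 1))
    if k < 1:
        # Too few calibration points for this alpha: accept everything,
        # which satisfies the bound trivially.
        return -math.inf
    return sorted(calib_scores)[k - 1]


def is_watermarked(score, tau):
    """Flag an image as watermarked when its score reaches the threshold."""
    return score >= tau


# Hypothetical calibration scores from seven watermarked images.
scores = [0.9, 0.8, 0.95, 0.7, 0.85, 0.6, 0.99]
tau = conformal_threshold(scores, alpha=0.25)
```

With these seven scores and `alpha=0.25`, `k` is 2, so `tau` is the second-smallest score, 0.7. The guarantee relies only on exchangeability between calibration and test scores, not on any distributional form, which is what makes it distribution-free. Robustness to removal attacks requires the additional smoothing step, which this sketch does not cover.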

RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees

REFIT: A Unified Watermark Removal Framework for Deep Learning Systems with Limited Data

Warfare: Breaking the Watermark Protection of AI-Generated Content

RAWIW: RAW Image Watermarking Robust to ISP Pipeline

InvisMark: Invisible and Robust Watermarking for AI-generated Image Provenance

Certifiably Robust Image Watermark

Unified High-binding Watermark for Unconditional Image Generation Models

Seeds Don't Lie: An Adaptive Watermarking Framework for Computer Vision Models

Evading Watermark based Detection of AI-Generated Content

SoK: Watermarking for AI-Generated Content

Robust Adversarial Watermark Defending Against GAN Synthesization Attack

Achieving Resolution-Agnostic DNN-based Image Watermarking: A Novel Perspective of Implicit Neural Representation

A Training-Free Plug-and-Play Watermark Framework for Stable Diffusion

Robust-Wide: Robust Watermarking against Instruction-driven Image Editing

Invisible Image Watermarks Are Provably Removable Using Generative AI

Scalable Universal Adversarial Watermark Defending Against Facial Forgery

Robust Identity Perceptual Watermark Against Deepfake Face Swapping

Towards Photo-Realistic Visible Watermark Removal with Conditional Generative Adversarial Networks

Robust Image Watermarking using Stable Diffusion