Abstract: Text-to-image diffusion models, such as Stable Diffusion, have shown exceptional potential in generating high-quality images. However, recent studies highlight concerns over the use of unauthorized data in training these models, which may lead to intellectual property infringement or privacy violations. A promising approach to mitigate these issues is to apply a watermark to images and subsequently check if generative models reproduce similar watermark features. In this paper, we examine the robustness of various watermark-based protection methods applied to text-to-image models. We observe that common image transformations are ineffective at removing the watermark effect. Therefore, we propose RATTAN, which leverages the diffusion process to conduct controlled image generation on the protected input, preserving the high-level features of the input while ignoring the low-level details utilized by watermarks. A small number of generated images are then used to fine-tune protected models. Our experiments on three datasets and 140 text-to-image diffusion models reveal that existing state-of-the-art protections are not robust against RATTAN.

What problem does this paper attempt to address?

This paper addresses unauthorized data use in text-to-image diffusion models such as Stable Diffusion. These models may be trained, inadvertently or deliberately, on data that is covered by intellectual property rights or that contains private information, which can lead to intellectual property infringement or privacy leakage. To mitigate this, prior work proposes watermark-based protection: a watermark is added to the images, and the images a model generates are then checked for similar watermark features to detect unauthorized data use. The authors examine how robust these protections are. They observe that common image transformations are ineffective at removing the watermark effect, so simple preprocessing does not defeat the protections. They therefore propose a new method named RATTAN, which uses the diffusion process to perform controlled image generation on the protected input, preserving its high-level features while ignoring the low-level details that watermarks rely on. A small number of images generated by RATTAN are then used to fine-tune the protected model, which reduces the detection rate of existing protection methods. The main contributions of the paper are as follows:

1. **Evaluating the robustness of existing watermark protection methods**: The authors tested a variety of common image transformations and found that they have limited effectiveness in removing watermark effects.
2. **Proposing the RATTAN method**: Through controlled image generation, RATTAN discards low-level details (such as watermarks) while retaining the high-level features of the input image, thus bypassing existing watermark protection mechanisms.
3. **Experimental verification**: The authors conducted experiments on three datasets and 140 text-to-image diffusion models. The results show that RATTAN reduces the detection rate of existing protection methods to 50%, which is equivalent to random guessing.

In conclusion, the paper shows that existing state-of-the-art watermark-based protections against unauthorized data use in text-to-image diffusion models are not robust against RATTAN, which bypasses them while preserving the high-level features of the protected images.
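The idea of regenerating an image so that its high-level content survives while low-level details are lost can be illustrated with a toy sketch. This is not the RATTAN implementation and makes strong simplifying assumptions: the "image" is a smooth synthetic pattern, the "watermark" is a fragile additive pattern of random signs, and the learned diffusion denoiser is replaced by a plain box blur. All function names and parameters below are hypothetical. Real watermark protections survive simple filtering like this, as the summary notes, which is why the paper relies on diffusion-based controlled generation instead.

```python
import numpy as np

def make_image(size=64):
    """Smooth 2-D pattern standing in for the high-level content of an image."""
    y, x = np.mgrid[0:size, 0:size] / size
    return 0.5 + 0.4 * np.sin(2 * np.pi * x) * np.cos(2 * np.pi * y)

def make_watermark(size=64, strength=0.02, seed=0):
    """Low-amplitude pattern of random signs: a toy, fragile watermark."""
    rng = np.random.default_rng(seed)
    return strength * rng.choice([-1.0, 1.0], size=(size, size))

def box_blur(img, k=5):
    """k x k mean filter with reflected borders.

    Assumption: used here as a crude stand-in for a learned denoiser.
    """
    pad = k // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros_like(img)
    h, w = img.shape
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def regenerate(img, noise_std=0.1, seed=1):
    """Add Gaussian noise (forward step), then denoise (reverse step)."""
    rng = np.random.default_rng(seed)
    noisy = img + noise_std * rng.normal(size=img.shape)
    return box_blur(noisy)

def watermark_score(img, wm):
    """Correlate the high-frequency residual of img with the watermark.

    Roughly 1 when the watermark is fully present, roughly 0 when absent.
    """
    residual = img - box_blur(img)
    return float(np.sum(residual * wm) / np.sum(wm * wm))

clean = make_image()
wm = make_watermark()
protected = clean + wm
regenerated = regenerate(protected)

score_before = watermark_score(protected, wm)
score_after = watermark_score(regenerated, wm)
content_corr = float(np.corrcoef(clean.ravel(), regenerated.ravel())[0, 1])

print(f"watermark score before: {score_before:.3f}")
print(f"watermark score after:  {score_after:.3f}")
print(f"content correlation:    {content_corr:.3f}")
```

In this toy setting the watermark score should start close to 1 and drop to roughly 0 after regeneration, while the correlation between the regenerated and the clean image stays high. In the pipeline described above, the regenerated images (not the protected originals) would then be used for fine-tuning.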

Exploiting Watermark-Based Defense Mechanisms in Text-to-Image Diffusion Models for Unauthorized Data Usage

Warfare: Breaking the Watermark Protection of AI-Generated Content

Robustness of Watermarking on Text-to-Image Diffusion Models

Watermarking for Stable Diffusion Models

A Watermark-Conditioned Diffusion Model for IP Protection

FT-Shield: A Watermark Against Unauthorized Fine-tuning in Text-to-Image Diffusion Models

SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models

DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models

Embedding Watermarks in Diffusion Process for Model Intellectual Property Protection

Intellectual Property Protection of Diffusion Models via the Watermark Diffusion Process

Ambiguity attack against text-to-image diffusion model watermarking

Detecting Dataset Abuse in Fine-Tuning Stable Diffusion Models for Text-to-Image Synthesis

Hey That's Mine Imperceptible Watermarks are Preserved in Diffusion Generated Outputs

Stable Signature is Unstable: Removing Image Watermark from Diffusion Models

Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models

Attack-Resilient Image Watermarking Using Stable Diffusion

Reliable Model Watermarking: Defending Against Theft without Compromising on Evasion

A Somewhat Robust Image Watermark against Diffusion-based Editing Models

Robust Image Watermarking using Stable Diffusion

Watermarking Diffusion Model

CopyrightMeter: Revisiting Copyright Protection in Text-to-image Models