Abstract: Recent text-to-image diffusion models have shown surprising performance in generating high-quality images. However, concerns have arisen regarding the unauthorized data usage during the training or fine-tuning process. One example is when a model trainer collects a set of images created by a particular artist and attempts to train a model capable of generating similar images without obtaining permission and giving credit to the artist. To address this issue, we propose a method for detecting such unauthorized data usage by planting the injected memorization into the text-to-image diffusion models trained on the protected dataset. Specifically, we modify the protected images by adding unique contents on these images using stealthy image warping functions that are nearly imperceptible to humans but can be captured and memorized by diffusion models. By analyzing whether the model has memorized the injected content (i.e., whether the generated images are processed by the injected post-processing function), we can detect models that had illegally utilized the unauthorized data. Experiments on Stable Diffusion and VQ Diffusion with different model training or fine-tuning methods (i.e., LoRA, DreamBooth, and standard training) demonstrate the effectiveness of our proposed method in detecting unauthorized data usages. Code: https://github.com/ZhentingWang/DIAGNOSIS

What problem does this paper attempt to address?

The paper addresses the detection of unauthorized data use during the training or fine-tuning of text-to-image diffusion models. Such use occurs, for example, when a model trainer collects a set of images created by a particular artist and trains a model to generate similar images without obtaining permission from, or giving credit to, the artist. The paper proposes to detect this by planting injected memorization into any text-to-image diffusion model trained on the protected dataset.

### Specific problems solved by the paper:

1. **Unauthorized data use**: The paper focuses on detecting unauthorized data use when text-to-image diffusion models are trained or fine-tuned. For example, a model trainer may collect an artist's portfolio and train a model to generate similar images without the artist's permission.
2. **Detection via injected memorization**: The paper modifies the protected images by applying a "signal function" that adds unique, nearly imperceptible content to them. A diffusion model trained on the modified images captures and memorizes this content. Checking whether a model has memorized the injected content reveals whether it was trained on the protected data without authorization.

### Method overview:

- **Signal function**: A secret image warping function, called the "signal function", is applied to the protected images to add unique content. The change is nearly invisible to humans but can be captured and memorized by a diffusion model.
- **Signal classifier**: A binary classifier, called the "signal classifier", is trained to determine whether a generated image has been processed by the signal function.
- **Hypothesis testing**: A statistical hypothesis test decides whether a given model used the protected data during training or fine-tuning.

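
The three steps in the method overview can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the sinusoidal warp stands in for the secret signal function, the signal classifier is assumed to exist already and to return one boolean per generated image, and an exact one-sided binomial test stands in for whatever test the paper uses. The names `warp_image`, `binomial_tail`, and `detect_unauthorized_use`, and the parameters `strength`, `period`, `false_positive_rate`, and `alpha`, are all hypothetical.

```python
import math


def warp_image(img, strength=1.0, period=8.0):
    """Hypothetical signal function: sample each pixel from a column that is
    shifted by a small sinusoidal offset depending on the row index.
    `img` is a list of rows (grayscale values); nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        dx = strength * math.sin(2 * math.pi * y / period)
        row = []
        for x in range(w):
            # Clamp the source column so we never read outside the image.
            src_x = min(w - 1, max(0, int(round(x + dx))))
            row.append(img[y][src_x])
        out.append(row)
    return out


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


def detect_unauthorized_use(flags, false_positive_rate=0.05, alpha=0.05):
    """`flags` holds the (assumed) signal-classifier output for each image
    generated by the inspected model: True if the signal was detected.

    Null hypothesis: the model was not trained on coated data, so the
    classifier fires only at its false-positive rate (an assumed value).
    Returns (reject_null, p_value)."""
    n = len(flags)
    k = sum(flags)
    p_value = binomial_tail(k, n, false_positive_rate)
    return p_value < alpha, p_value


if __name__ == "__main__":
    image = [[0, 1, 2, 3] for _ in range(8)]
    coated = warp_image(image)  # what the data owner would publish
    print(coated[2])

    # Classifier outputs on 50 images generated by an inspected model.
    suspicious = [True] * 45 + [False] * 5
    print(detect_unauthorized_use(suspicious))
```

In this sketch a model is flagged only when the signal appears in its outputs far more often than the classifier's assumed false-positive rate would explain, which is the role the hypothesis test plays in the overview above.
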
### Experimental results:

- **Detection accuracy**: The method detects unauthorized data use across multiple models (Stable Diffusion and VQ Diffusion) and multiple training or fine-tuning methods (LoRA, DreamBooth, and standard training), with a detection accuracy of up to 100%.
- **Impact on generation quality**: The method has a relatively small impact on the quality of images generated by the model, and the "coated images" (the protected images after the signal function is applied) are very close to the originals.

### Contributions:

1. **Defined two element-level injected memorizations**: Sample-level memorization and trigger-conditioned memorization.
2. **Proposed a framework**: Unauthorized data use is detected by planting injected memorization through the protected dataset.
3. **Extensive experimental verification**: Experiments on multiple datasets and mainstream text-to-image diffusion models demonstrate the effectiveness of the method.

In conclusion, the paper proposes a method that detects unauthorized data use in text-to-image diffusion models by injecting unique content into protected datasets.

DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models

Exploiting Watermark-Based Defense Mechanisms in Text-to-Image Diffusion Models for Unauthorized Data Usage

EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations

Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning

The Stronger the Diffusion Model, the Easier the Backdoor: Data Poisoning to Induce Copyright Breaches Without Adjusting Finetuning Pipeline

Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention

Detecting Dataset Abuse in Fine-Tuning Stable Diffusion Models for Text-to-Image Synthesis

Unlearnable Examples for Diffusion Models: Protect Data from Unauthorized Exploitation

Revealing the Unseen: Guiding Personalized Diffusion Models to Expose Training Data

On Copyright Risks of Text-to-Image Diffusion Models

Towards Reliable Verification of Unauthorized Data Usage in Personalized Text-to-Image Diffusion Models

Defending Text-to-image Diffusion Models: Surprising Efficacy of Textual Perturbations Against Backdoor Attacks

A Dataset and Benchmark for Copyright Infringement Unlearning from Text-to-Image Diffusion Models

Toward Robust Imperceptible Perturbation against Unauthorized Text-to-image Diffusion-based Synthesis

EditShield: Protecting Unauthorized Image Editing by Instruction-guided Diffusion Models

Watermarking for Stable Diffusion Models

Unveiling Structural Memorization: Structural Membership Inference Attack for Text-to-Image Diffusion Models

Understanding and Mitigating Copying in Diffusion Models

FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via Selective Tensor Freezing

Perturbing Attention Gives You More Bang for the Buck: Subtle Imaging Perturbations That Efficiently Fool Customized Diffusion Models