DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models

Zhenting Wang, Chen Chen, Lingjuan Lyu, Dimitris N. Metaxas, Shiqing Ma
2024-04-10
Abstract: Recent text-to-image diffusion models have shown surprising performance in generating high-quality images. However, concerns have arisen regarding unauthorized data usage during the training or fine-tuning process. One example is when a model trainer collects a set of images created by a particular artist and attempts to train a model capable of generating similar images without obtaining permission from or giving credit to the artist. To address this issue, we propose a method for detecting such unauthorized data usage by planting injected memorization into the text-to-image diffusion models trained on the protected dataset. Specifically, we modify the protected images by adding unique contents to these images using stealthy image warping functions that are nearly imperceptible to humans but can be captured and memorized by diffusion models. By analyzing whether the model has memorized the injected content (i.e., whether the generated images are processed by the injected post-processing function), we can detect models that have illegally utilized the unauthorized data. Experiments on Stable Diffusion and VQ Diffusion with different model training or fine-tuning methods (i.e., LoRA, DreamBooth, and standard training) demonstrate the effectiveness of our proposed method in detecting unauthorized data usages. Code: https://github.com/ZhentingWang/DIAGNOSIS
Computer Vision and Pattern Recognition, Cryptography and Security, Machine Learning
What problem does this paper attempt to address?
The paper addresses the detection of unauthorized data usage during the training or fine-tuning of text-to-image diffusion models. Such usage occurs, for example, when a model trainer collects a set of images created by a particular artist and trains a model to generate similar images without obtaining permission from, or giving credit to, the artist. The paper proposes to detect this by planting injected memorization into any text-to-image diffusion model that is trained on the protected dataset.

### Specific problems solved by the paper:

1. **Unauthorized data use**: The paper focuses on how to detect unauthorized data use when training or fine-tuning text-to-image diffusion models. For example, a model trainer may collect an artist's portfolio and train a model to generate similar images without the artist's permission.
2. **Detection via injected memorization**: The paper modifies the protected images by applying a "signal function" that adds unique, nearly imperceptible content to them. A diffusion model trained on the modified images captures and memorizes this content. Checking whether a model has memorized the injected content reveals whether it was trained on the protected data without authorization.

### Method overview:

- **Signal function**: A secret image warping function, called the "signal function", is applied to the protected images to add unique content. The change is nearly imperceptible to humans but can be captured and memorized by the diffusion model.
- **Signal classifier**: A binary classifier, called the "signal classifier", is trained to decide whether a generated image has been processed by the signal function.
- **Hypothesis testing**: A statistical hypothesis test on the classifier's outputs decides whether a given model used the protected data during training or fine-tuning.

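
The three steps in the method overview can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the sinusoidal row shift in `warp_signal` is a hypothetical stand-in for the paper's warping function, the names `warp_signal`, `binomial_tail` and `detect_unauthorized_use` are invented here, and a one-sided exact binomial test is swapped in because this summary does not specify the paper's test statistic. The signal classifier itself is not shown; its per-image decisions are assumed to arrive as a list of booleans.

```python
import math


def warp_signal(image, strength=1.0, period=8.0):
    """Hypothetical signal function (assumption, not the paper's warp).

    Shifts each row horizontally by a small sinusoidal offset, using
    nearest-neighbour sampling with edge clamping. `image` is a list of
    rows, each row a list of pixel values.
    """
    height, width = len(image), len(image[0])
    warped = []
    for y in range(height):
        shift = round(strength * math.sin(2 * math.pi * y / period))
        row = [image[y][min(max(x + shift, 0), width - 1)] for x in range(width)]
        warped.append(row)
    return warped


def binomial_tail(k, n, p):
    """Return P[X >= k] for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def detect_unauthorized_use(flags, false_positive_rate=0.05, alpha=0.01):
    """Stand-in hypothesis test (assumption: exact one-sided binomial test).

    `flags` holds the signal classifier's decision for each image generated
    by the inspected model. Under the null hypothesis the model never saw
    the protected data, so a flag is raised only as a classifier false
    positive, with probability `false_positive_rate`.
    Returns (detected, p_value).
    """
    n = len(flags)
    k = sum(1 for flag in flags if flag)
    p_value = binomial_tail(k, n, false_positive_rate)
    return p_value < alpha, p_value
```

In use, the data owner would apply the signal function to the images before releasing them, later generate a batch of images from a suspect model, run the signal classifier on that batch, and pass the resulting flags to `detect_unauthorized_use`. The values of `false_positive_rate` and `alpha` are illustrative defaults; in practice the false positive rate would be estimated on images from models known to be clean.
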
### Experimental results:

- **Detection accuracy**: The experiments show that the method detects unauthorized data use across multiple models (such as Stable Diffusion and VQ Diffusion) and different training or fine-tuning methods (such as LoRA, DreamBooth and standard training), with a detection accuracy of up to 100%.
- **Impact on generation quality**: The method has a relatively small impact on the quality of images generated by the model, and the "coated images" (the protected images after the signal function is applied) are very close to the originals.

### Contributions:

1. **Defined two element-level injected memorizations**: Sample-level memorization and memorization under trigger conditions.
2. **Proposed a framework**: Detects unauthorized data use by injecting memorization through the protected dataset.
3. **Extensive experimental verification**: Experiments on multiple datasets and mainstream text-to-image diffusion models demonstrate the effectiveness of the method.

In conclusion, the paper proposes a method that detects unauthorized data use in text-to-image diffusion models by injecting unique content into protected datasets.