Detecting Dataset Abuse in Fine-Tuning Stable Diffusion Models for Text-to-Image Synthesis

Songrui Wang, Yubo Zhu, Wei Tong, Sheng Zhong
2024-09-28
Abstract: Text-to-image synthesis has become highly popular for generating realistic and stylized images, often requiring fine-tuning generative models with domain-specific datasets for specialized tasks. However, these valuable datasets face risks of unauthorized usage and unapproved sharing, compromising the rights of the owners. In this paper, we address the issue of dataset abuse during the fine-tuning of Stable Diffusion models for text-to-image synthesis. We present a dataset watermarking framework designed to detect unauthorized usage and trace data leaks. The framework employs two key strategies across multiple watermarking schemes and is effective for large-scale dataset authorization. Extensive experiments demonstrate the framework's effectiveness, minimal impact on the dataset (only 2% of the data required to be modified for high detection accuracy), and ability to trace data leaks. Our results also highlight the robustness and transferability of the framework, proving its practical applicability in detecting dataset abuse.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses dataset abuse when fine-tuning Stable Diffusion models for text-to-image synthesis. Specifically, the authors focus on how to detect unauthorized use and trace the source of data leaks. The problem is described in detail below.

### 1. **Background and Motivation**

Text-to-image synthesis has become very popular in recent years and can generate realistic and stylized images. Achieving high-quality synthesis for a specific task usually requires fine-tuning the generative model on a domain-specific dataset. However, these valuable datasets are at risk of unauthorized use and unapproved sharing, which violates the rights of data owners.

### 2. **Problem Description**

Dataset abuse mainly involves two scenarios:

1. **Improper use by authorized entities**: For example, a dataset is licensed for an advertising generation model but is used to create fake news about a specific topic.
2. **Unauthorized sharing**: Authorized users may share the dataset with other parties for profit without proper authorization or consent, violating the rights and interests of data owners.

### 3. **Research Objectives**

The authors aim to develop a framework that can detect dataset abuse and trace the source of data leaks when Stable Diffusion models are fine-tuned. The specific objectives are:

- **Accuracy**: Accurately determine whether a dataset has been used to train a text-to-image synthesis model, thereby detecting data abuse and identifying the source of a leak.
- **Harmlessness**: The injected watermark should not affect normal use of the data, so that a model trained on it can still generate high-quality images.
- **Traceability**: Trace the source of the data and identify unauthorized users.
- **Concealment**: The amount of modified data should be as small as possible, and the modifications as unnoticeable as possible, so that malicious users cannot detect and bypass the protection.

### 4. **Solutions**

To this end, the authors propose a new dataset watermarking framework that achieves these goals in two stages:

- **Activation Token Selection and Watermark Injection**: Select specific text tokens as activation tokens and embed watermarks in the images associated with these tokens. When a specified token appears in the input text, a model fine-tuned on the dataset then generates images carrying the watermark features.
- **Watermark Detection**: Train a discriminator to detect watermark patterns in images generated by a text-to-image model. The data owner prompts the suspected model with the activation tokens and runs the discriminator on the generated images to determine whether the data has been abused.

### 5. **Experimental Verification**

The authors conducted extensive experiments to verify the effectiveness, harmlessness, traceability, and robustness of the framework. The results show that the method achieves high detection accuracy while modifying only about 2% of the data, and that it transfers well across datasets and fine-tuning methods.

### Summary

The main contributions of this paper are:

- An in-depth study of dataset abuse during the fine-tuning of text-to-image models, and an exploration of effective detection methods.
- A dataset watermarking framework, consisting of an activation token selection and watermark injection stage and a watermark detection stage, that enables data owners to detect potential data abuse.
- Experimental verification of the framework's effectiveness, harmlessness, traceability, and robustness, demonstrating its practical value.
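The two stages described under Solutions (injecting a watermark into images tied to an activation token, then prompting a suspected model with that token and checking its outputs) can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the corner-patch watermark, the rule-based `has_watermark` check standing in for a trained discriminator, the `budget` and `threshold` parameters, and the function names. The suspected model is represented by any callable that maps a prompt to a 2D grayscale image.

```python
import random

PATCH = 4  # side length of the hypothetical corner-patch watermark


def stamp(image):
    """Return a copy of a 2D grayscale image with a saturated corner patch."""
    out = [row[:] for row in image]
    for r in range(PATCH):
        for c in range(PATCH):
            out[r][c] = 1.0
    return out


def inject_watermark(dataset, activation_token, budget=0.02, seed=0):
    """Watermark at most `budget` of the dataset, choosing only samples
    whose caption contains the activation token (assumed selection rule)."""
    rng = random.Random(seed)
    candidates = [i for i, (caption, _) in enumerate(dataset)
                  if activation_token in caption.split()]
    limit = int(len(dataset) * budget)
    chosen = set(rng.sample(candidates, min(limit, len(candidates))))
    marked = [(caption, stamp(img)) if i in chosen else (caption, img)
              for i, (caption, img) in enumerate(dataset)]
    return marked, chosen


def has_watermark(image):
    """Toy stand-in for a discriminator: is the corner patch saturated?"""
    return all(image[r][c] > 0.99
               for r in range(PATCH) for c in range(PATCH))


def detect_abuse(generate, activation_token, n_queries=20, threshold=0.5):
    """Prompt the suspected model with the activation token and flag it
    when the share of watermarked outputs exceeds `threshold`."""
    prompt = f"a photo of {activation_token}"
    hits = sum(has_watermark(generate(prompt)) for _ in range(n_queries))
    rate = hits / n_queries
    return rate > threshold, rate


if __name__ == "__main__":
    blank = lambda: [[0.5] * 8 for _ in range(8)]
    data = [("a photo of zebra" if i % 10 == 0 else "a photo of cat", blank())
            for i in range(100)]
    marked, chosen = inject_watermark(data, "zebra")
    print("watermarked samples:", len(chosen))

    # Simulated suspect that reproduces the watermark for the activation token.
    suspect = lambda p: stamp(blank()) if "zebra" in p else blank()
    print("flagged:", detect_abuse(suspect, "zebra"))
```

In a real setting the discriminator would be a trained classifier and the decision rule would more likely be a statistical test over many generations than a fixed threshold; the sketch only shows how the activation token links the injection stage to the detection stage.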