On Provable Copyright Protection for Generative Models

Nikhil Vyas, Sham Kakade, Boaz Barak
2023-07-22
Abstract: There is a growing concern that learned conditional generative models may output samples that are substantially similar to some copyrighted data $C$ that was in their training set. We give a formal definition of $\textit{near access-freeness (NAF)}$ and prove bounds on the probability that a model satisfying this definition outputs a sample similar to $C$, even if $C$ is included in its training set. Roughly speaking, a generative model $p$ is $\textit{$k$-NAF}$ if for every potentially copyrighted data $C$, the output of $p$ diverges by at most $k$-bits from the output of a model $q$ that $\textit{did not access $C$ at all}$. We also give generative model learning algorithms, which efficiently modify the original generative model learning algorithm in a black box manner, that output generative models with strong bounds on the probability of sampling protected content. Furthermore, we provide promising experiments for both language (transformers) and image (diffusion) generative models, showing minimal degradation in output quality while ensuring strong protections against sampling protected content.
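The rough statement of $k$-NAF in the abstract can be written out as follows. This is a sketch: the symbols $\mathcal{C}$ (the set of protected items), $x$ (a prompt), $\mathrm{safe}_C$ (a model that never accessed $C$), and $\Delta$ (the divergence) are notation assumed here for illustration. A model $p$ is $k$-NAF on prompt $x$ if

$$
\Delta\bigl(p(\cdot \mid x)\,\big\|\,\mathrm{safe}_C(\cdot \mid x)\bigr) \le k \quad \text{for every } C \in \mathcal{C}.
$$

Assuming $\Delta$ is the max-KL divergence measured in bits, $\Delta_{\max}(p \,\|\, q) = \max_{y:\,p(y)>0} \log_2 \frac{p(y)}{q(y)}$, the condition gives $p(y \mid x) \le 2^{k}\,\mathrm{safe}_C(y \mid x)$ for every output $y$. Summing over any event $E$, such as "the output is substantially similar to $C$", yields

$$
p(E \mid x) \le 2^{k}\,\mathrm{safe}_C(E \mid x).
$$

So if the safe model is very unlikely to produce something similar to $C$, a $k$-NAF model with small $k$ is unlikely to do so as well.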
Machine Learning
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

The paper "On Provable Copyright Protection for Generative Models" addresses the risk that the outputs of a generative model infringe copyright. Specifically, a conditional generative model may see copyrighted data during training and later produce samples that are substantially similar to that data.

### Background and Motivation

1. **Copyright Issues During Training**:
   - Generative models typically require large amounts of training data, which may include copyrighted content.
   - Even when copyrighted data could be removed from the training set, excluding it entirely may not be desirable, because such data can improve generation quality.
2. **Copyright Issues During Deployment**:
   - After deployment, a user's prompt may yield output that is a direct copy of, or substantially similar to, copyrighted data.
   - Because generative models do not record the source of their outputs, users cannot easily verify whether generated content infringes copyright.

### Main Contributions

1. **Formal Definition**:
   - Introduces "Near Access-Freeness" (NAF), which quantifies how much the output of a generative model is influenced by a specific piece of copyrighted data.
   - A generative model \( p \) is \( k \)-NAF if, for every piece of copyrighted data \( C \), the output distribution of \( p \) diverges by at most \( k \) bits from that of a "safe" model \( q \) that did not access \( C \).
2. **Algorithm Design**:
   - Proposes algorithms that transform any generative model learning algorithm \( A \) into a copyright-protecting algorithm \( A_k \), using \( A \) as a black box.
   - The model produced by \( A_k \) stays close in quality to the model produced by \( A \), while the probability that it samples protected content is provably bounded.
3. **Experimental Validation**:
   - Reports experiments on language (transformer) and image (diffusion) generative models, showing minimal degradation in output quality while protecting against sampling protected content.

### Key Concepts

- **Near Access-Freeness (NAF)**:
  - The \( k \)-NAF condition defined above bounds, for every \( C \), the divergence between \( p \) and the corresponding safe model by \( k \) bits.
  - The divergence is an information-theoretic measure, either the max-KL divergence or the KL divergence.
- **Safe Model**:
  - A generative model that has not accessed copyrighted data \( C \), denoted as \( \text{safe}(C) \).

### Methods for Constructing Safe Models

- **Leave-One-Out-Safe Method**:
  - Remove all samples containing copyrighted data \( C \) from the training set, then train the model on what remains.
  - Because a separate model is needed for each \( C \), this can require training many models, which is inefficient in practice.
- **Sharded-Safe Method**:
  - Divide the training set into two disjoint parts and train two models \( q_1 \) and \( q_2 \), one on each part.
  - For each piece of copyrighted data \( C \), use the model that was not trained on \( C \) as the "safe" model. This requires that \( C \) appears in only one of the two parts.

### Conclusion

The paper proposes a formal framework and algorithms for copyright protection in generative models. By introducing the concept of "Near Access-Freeness," it provides a way to quantify and reduce the influence of copyrighted data on the output of generative models.
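
The sharded-safe construction and the \( k \)-NAF condition summarized above can be illustrated on toy discrete distributions. The sketch below is a minimal illustration, not the paper's implementation: the function names (`max_kl_bits`, `combine_min`, `naf_bits`) and the toy distributions are hypothetical, and the combination rule (a normalized pointwise minimum of \( q_1 \) and \( q_2 \)) is an assumption of this sketch, since the summary does not specify how \( A_k \) combines the two models.

```python
import math


def max_kl_bits(p, q):
    """Max-KL divergence in bits: max over y with p(y) > 0 of log2(p(y) / q(y))."""
    worst = -math.inf
    for y, py in p.items():
        if py == 0.0:
            continue  # outcomes p never produces do not count
        qy = q.get(y, 0.0)
        if qy == 0.0:
            return math.inf  # p can emit something q never emits
        worst = max(worst, math.log2(py / qy))
    return worst


def combine_min(q1, q2):
    """Assumed combination rule: normalized pointwise minimum of q1 and q2."""
    outcomes = set(q1) | set(q2)
    unnorm = {y: min(q1.get(y, 0.0), q2.get(y, 0.0)) for y in outcomes}
    z = sum(unnorm.values())
    if z == 0.0:
        raise ValueError("q1 and q2 share no outcomes; cannot normalize")
    return {y: v / z for y, v in unnorm.items()}


def naf_bits(p, q1, q2):
    """Smallest k such that p is within k bits (max-KL) of both shard models.

    Under sharded-safe, the safe model for a given C is whichever shard model
    did not see C, so p must stay close to both q1 and q2.
    """
    return max(max_kl_bits(p, q1), max_kl_bits(p, q2))


# Toy example: q1 was trained on the shard containing C and has memorized it
# (the "copy" outcome); q2 never saw C.
q1 = {"copy": 0.90, "a": 0.05, "b": 0.05}
q2 = {"copy": 0.01, "a": 0.49, "b": 0.50}

p = combine_min(q1, q2)
k = naf_bits(p, q1, q2)

print("p =", p)
print("k (bits) =", k)
# The k-NAF guarantee: p("copy") is at most 2**k times the safe model's probability.
print("bound on p('copy') =", 2 ** k * q2["copy"])
```

In this toy case \( k \) comes out to roughly 3.18 bits, and the probability of the memorized output drops from 0.90 under \( q_1 \) to about 0.09 under \( p \), which matches the bound \( 2^k \cdot q_2(\text{copy}) \).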