Mosaic Memory: Fuzzy Duplication in Copyright Traps for Large Language Models

Igor Shilov, Matthieu Meeus, Yves-Alexandre de Montjoye
2024-05-24
Abstract: The immense datasets used to develop Large Language Models (LLMs) often include copyright-protected content, typically without the content creator's consent. Copyright traps have been proposed to be injected into the original content, improving content detectability in newly released LLMs. Traps, however, rely on the exact duplication of a unique text sequence, leaving them vulnerable to commonly deployed data deduplication techniques. We here propose the generation of fuzzy copyright traps, featuring slight modifications across duplication. When injected in the fine-tuning data of a 1.3B LLM, we show fuzzy trap sequences to be memorized nearly as well as exact duplicates. Specifically, the Membership Inference Attack (MIA) ROC AUC only drops from 0.90 to 0.87 when 4 tokens are replaced across the fuzzy duplicates. We also find that selecting replacement positions to minimize the exact overlap between fuzzy duplicates leads to similar memorization, while making fuzzy duplicates highly unlikely to be removed by any deduplication process. Lastly, we argue that the fact that LLMs memorize across fuzzy duplicates challenges the study of LLM memorization relying on naturally occurring duplicates. Indeed, we find that the commonly used training dataset, The Pile, contains significant amounts of fuzzy duplicates. This introduces a previously unexplored confounding factor in post-hoc studies of LLM memorization, and questions the effectiveness of (exact) data deduplication as a privacy protection technique.
Computation and Language, Machine Learning
What problem does this paper attempt to address?
The paper asks whether copyright traps can be made robust to data deduplication by using fuzzy duplicates, and what the memorization of fuzzy duplicates implies for research on LLM memorization. Specifically, it addresses the following issues:

1. **Copyright Protection**: How to inject synthetic copyright traps that data deduplication techniques are unlikely to remove, so that it remains possible to detect whether a model has been trained on copyrighted content.
2. **Model Memorization Mechanism**: Whether LLMs memorize fuzzy duplicates, i.e., whether they exhibit "mosaic memory," the ability to memorize a sequence from partially overlapping fragments rather than from exact repetitions only.
3. **Limitations of Existing Research**: Studies that rely on naturally occurring duplicates may overestimate the impact of exact duplicates on memorization, because they do not account for fuzzy duplicates, which the authors find in significant amounts in The Pile.
4. **Evaluation of Privacy Protection Measures**: Whether (exact) data deduplication is effective as a privacy protection technique when fuzzy duplicates remain in the training data.

By introducing and studying fuzzy copyright traps, the paper shows that trap sequences injected into the fine-tuning data of a 1.3B LLM are memorized nearly as well as exact duplicates (the MIA ROC AUC drops only from 0.90 to 0.87 when 4 tokens are replaced across the fuzzy duplicates). It also identifies fuzzy duplication as a previously unexplored confounding factor in post-hoc studies of LLM memorization.
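To make the idea of fuzzy duplicates concrete, the following is a minimal sketch of how copies of a trap sequence could be produced by replacing a fixed number of tokens in each copy. It is not the authors' procedure: the abstract only states that tokens are replaced across duplicates, so the function name `make_fuzzy_duplicates`, its parameters, and the choice of uniformly random positions and uniformly random replacement token IDs are all assumptions made for illustration.

```python
import random


def make_fuzzy_duplicates(tokens, n_duplicates, n_replacements, vocab_size, seed=0):
    """Return n_duplicates copies of `tokens`, each with n_replacements token IDs changed.

    Hypothetical helper: positions and replacement IDs are drawn uniformly at
    random, which is an assumption and not taken from the paper.
    """
    if n_replacements > len(tokens):
        raise ValueError("n_replacements cannot exceed the sequence length")
    if vocab_size < 2:
        raise ValueError("vocab_size must be at least 2 to pick a different token")

    rng = random.Random(seed)
    duplicates = []
    for _ in range(n_duplicates):
        copy = list(tokens)
        # Pick distinct positions to modify in this copy.
        for pos in rng.sample(range(len(copy)), n_replacements):
            # Draw from vocab_size - 1 values and shift past the original ID,
            # so the replacement always differs from the original token.
            new_token = rng.randrange(vocab_size - 1)
            if new_token >= copy[pos]:
                new_token += 1
            copy[pos] = new_token
        duplicates.append(copy)
    return duplicates


# Example: 10 fuzzy copies of a 100-token trap, 4 tokens replaced in each copy.
trap = list(range(1000, 1100))  # placeholder token IDs, not real trap text
fuzzy = make_fuzzy_duplicates(trap, n_duplicates=10, n_replacements=4, vocab_size=50000)
```

Because positions are sampled independently per copy, two copies may still share long exact substrings. The abstract notes that replacement positions can instead be selected to minimize the exact overlap between fuzzy duplicates; that selection strategy is not sketched here.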