Generated Data with Fake Privacy: Hidden Dangers of Fine-tuning Large Language Models on Generated Data

Atilla Akkus, Mingjie Li, Junjie Chu, Michael Backes, Yang Zhang, Sinem Sav
2024-09-12
Abstract: Large language models (LLMs) have shown considerable success in a range of domain-specific tasks, especially after fine-tuning. However, fine-tuning with real-world data usually leads to privacy risks, particularly when the fine-tuning samples exist in the pre-training data. To avoid the shortcomings of real data, developers often employ methods to automatically generate synthetic data for fine-tuning, as data generated by traditional models are often far away from the real-world pre-training data. However, given the advanced capabilities of LLMs, the distinction between real data and LLM-generated data has become negligible, which may lead to privacy risks similar to those of real data. In this paper, we present an empirical analysis of this underexplored issue by investigating a key question: "Does fine-tuning with LLM-generated data enhance privacy, or does it pose additional privacy risks?" Based on the structure of LLM-generated data, our research focuses on two primary approaches to fine-tuning with generated data: supervised fine-tuning with unstructured generated data and self-instruct tuning. The number of successful personally identifiable information (PII) extractions for Pythia after fine-tuning on our generated data rose by over $20\%$. Furthermore, the ROC-AUC score of membership inference attacks for Pythia-6.9b after self-instruct tuning improves by more than $40\%$ over the base models. The results indicate the potential privacy risks in LLMs when fine-tuning with the generated data.
Cryptography and Security, Machine Learning
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper explores whether fine-tuning on data generated by large language models (LLMs) enhances privacy or introduces additional privacy risks. The research focuses on two settings:

1. **Supervised fine-tuning on unstructured generated data**:
   - The researchers test whether the success rate of personally identifiable information (PII) extraction attacks increases after fine-tuning the Pythia model on generated data in the email domain.
   - The fine-tuned model proved significantly more vulnerable to PII extraction. In particular, after fine-tuning on the generated email data, the success rate increased by more than 20%.
2. **Self-instruct tuning**:
   - Using law-related tasks, the researchers evaluate the risk of membership inference attacks (MIA) after fine-tuning on data generated by the self-instruct method.
   - The MIA ROC-AUC score of the Pythia-6.9b model after self-instruct tuning on the FreeLaw dataset was about 20% higher than that of the pre-trained model, indicating that self-instruct tuning may exacerbate privacy risks.

### Key issues and findings

- **Amplification of privacy risks**: Fine-tuning on either unstructured generated data or self-instruct generated data may increase privacy risks. The risk is more pronounced when the generated data is of high quality and similar to the pre-training data.
- **Influence of the quality and template of generated data**: The template and quality of the generated data are the main factors affecting the PII extraction success rate. High-quality generated data may make it easier for the model to memorize sensitive information, increasing the risk of privacy leakage.
- **Influence of model size**: Larger models show a higher PII extraction success rate after fine-tuning, which is attributed to their stronger representation capabilities and greater capacity to memorize training data.

### Conclusion

This paper shows that although fine-tuning on generated data can improve model performance in specific domains, it may also bring serious privacy risks. Researchers therefore need to handle the quality and structure of generated data more carefully to reduce potential privacy leakage.
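
To make the membership inference metric above concrete, the following is a minimal sketch of how a ROC-AUC score can be computed for an attack. It assumes a simple loss-based attack (lower per-sample loss is taken as evidence of membership), which is a common baseline and not necessarily the attack used in the paper. The function names `roc_auc` and `loss_attack_scores` and all loss values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def roc_auc(member_scores: Sequence[float], nonmember_scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random member outscores a random non-member.

    Ties count as half a win (the Mann-Whitney U formulation of AUC).
    A value of 0.5 means the attack is no better than random guessing.
    """
    if len(member_scores) == 0 or len(nonmember_scores) == 0:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


def loss_attack_scores(losses: Sequence[float]) -> List[float]:
    """Assumed loss-based attack: lower loss suggests membership, so score = -loss."""
    return [-x for x in losses]


# Hypothetical per-sample losses, invented for illustration only.
member_losses = [1.2, 1.5, 1.1, 2.0]      # samples seen during training
nonmember_losses = [1.8, 2.2, 1.4, 2.5]   # held-out samples

auc = roc_auc(loss_attack_scores(member_losses), loss_attack_scores(nonmember_losses))
print(f"MIA ROC-AUC: {auc:.4f}")
```

Comparing this score for a base model and for its fine-tuned counterpart, on the same member and non-member samples, is one way to express the kind of increase reported above. The pairwise loop is quadratic in the number of samples, so for large evaluation sets a rank-based implementation would be the more practical choice.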