Abstract: Large language models (LLMs) have shown considerable success in a range of domain-specific tasks, especially after fine-tuning. However, fine-tuning with real-world data usually leads to privacy risks, particularly when the fine-tuning samples exist in the pre-training data. To avoid the shortcomings of real data, developers often employ methods to automatically generate synthetic data for fine-tuning, as data generated by traditional models are often far from the real-world pre-training data. However, given the advanced capabilities of LLMs, the distinction between real data and LLM-generated data has become negligible, so LLM-generated data may carry privacy risks similar to those of real data. In this paper, we present an empirical analysis of this underexplored issue by investigating a key question: "Does fine-tuning with LLM-generated data enhance privacy, or does it pose additional privacy risks?" Based on the structure of LLM-generated data, our research focuses on two primary approaches to fine-tuning with generated data: supervised fine-tuning with unstructured generated data and self-instruct tuning. The number of successful personally identifiable information (PII) extractions for Pythia increased by over $20\%$ after fine-tuning on our generated data. Furthermore, the ROC-AUC score of membership inference attacks against Pythia-6.9b after self-instruct tuning improves by more than $40\%$ over the base model. The results indicate potential privacy risks in LLMs that are fine-tuned on generated data.

What problem does this paper attempt to address?

### Problems the paper attempts to solve

This paper explores and evaluates whether fine-tuning with data generated by large language models (LLMs) enhances privacy or introduces additional privacy risks. The research focuses on two settings:

1. **Supervised fine-tuning with unstructured generated data**:
   - The researchers tested whether the success rate of personally identifiable information (PII) extraction attacks increases after fine-tuning the Pythia model on generated data in the email domain.
   - The success rate of PII extraction attacks against the fine-tuned model increased significantly. When fine-tuning on the generated email data, the success rate increased by more than 20%.
2. **Self-instruct tuning**:
   - Using law-related tasks, the researchers evaluated the risk of membership inference attacks (MIA) after fine-tuning with data generated by the self-instruct method.
   - The MIA ROC-AUC score of the Pythia-6.9b model after self-instruct tuning on the FreeLaw dataset was about 20% higher than that of the pre-trained model, indicating that self-instruct tuning may exacerbate privacy risks.

### Key issues and findings

- **Amplification of privacy risks**: Fine-tuning with either unstructured generated data or self-generated data may increase privacy risks. The risk is more pronounced when the generated data is of high quality and similar to the pre-training data.
- **Influence of the quality and template of generated data**: The template and quality of the generated data are the main factors affecting the success rate of PII extraction. High-quality generated data may make it easier for the model to memorize sensitive information, increasing the risk of privacy leakage.
- **Influence of model size**: Larger models show a higher PII extraction success rate after fine-tuning, which is attributed to their stronger representation capabilities and greater tendency to memorize training data.

### Conclusion

Although fine-tuning with generated data can improve model performance in specific domains, it may also introduce serious privacy risks. Researchers therefore need to handle the quality and structure of generated data more carefully to reduce potential privacy leakage.
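The two metrics summarized above, PII extraction success and MIA ROC-AUC, can be illustrated with a minimal Python sketch. This is not the paper's evaluation code. The function names, the email regex, and the toy inputs are all hypothetical, and treating the reported percentages as relative increases is an assumption, since the summary does not say whether they are relative or absolute.

```python
import re

# Simplified email pattern (an assumption; real PII detectors are stricter).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def count_pii_extractions(generations, known_emails):
    """Count distinct known emails that appear verbatim in model generations."""
    known = {email.lower() for email in known_emails}
    found = set()
    for text in generations:
        for match in EMAIL_RE.findall(text):
            if match.lower() in known:
                found.add(match.lower())
    return len(found)


def roc_auc(member_scores, nonmember_scores):
    """ROC-AUC as the probability that a random member outscores a random
    non-member, with ties counted as one half (pairwise definition)."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


def relative_increase(before, after):
    """Relative change of a metric, e.g. 0.5 -> 0.6 is a 20% increase."""
    return (after - before) / before


# Toy data, purely illustrative.
generations = [
    "Contact alice@example.com for details.",
    "Reach bob@example.org or ALICE@example.com.",
]
known_emails = ["alice@example.com", "carol@example.net"]
extracted = count_pii_extractions(generations, known_emails)

# Higher score = attacker believes the sample was in the training data.
auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

In this sketch a higher attack score means the attacker considers a sample more likely to be a training member. An AUC near 0.5 means the attack does no better than chance, so a rise in AUC after fine-tuning indicates greater membership leakage.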

Generated Data with Fake Privacy: Hidden Dangers of Fine-tuning Large Language Models on Generated Data
