PSY: Posterior Sampling Based Privacy Enhancer in Large Language Models

Yulian Sun, Li Duan, Yong Li
2024-10-24
Abstract: Privacy vulnerabilities in LLMs, such as leakage from memorization, have been constantly identified, and various mitigations have been proposed. LoRA is commonly used in fine-tuning LLMs and is a good entry point for inserting privacy-enhancing modules. In this ongoing research, we introduce PSY, a Posterior Sampling based PrivacY enhancer that can be used in LoRA. We propose a simple yet effective realization of PSY using posterior sampling, which prevents privacy leakage from intermediate information and, in turn, preserves the privacy of data owners. We evaluate LoRA extended with PSY against state-of-the-art membership inference and data extraction attacks. The experiments are executed on three different LLM architectures fine-tuned on three datasets with LoRA. In contrast to the commonly used differential privacy method, we find that our proposed modification consistently reduces the attack success rate. Meanwhile, our method has almost no negative impact on model fine-tuning or final performance. Most importantly, PSY reveals a promising path toward privacy enhancement with latent space extensions.
Cryptography and Security
What problem does this paper attempt to address?
This paper addresses privacy leakage in large language models (LLMs) caused by memorization of training data during fine-tuning. Specifically, LLMs may memorize sensitive information in the training and fine-tuning datasets, and attackers can exploit this at inference time to recover that information. This threatens users' privacy and also poses a security risk to public service platforms.

To solve this problem, the authors propose a new module named PSY (Posterior Sampling based PrivacY enhancer). Using posterior sampling, PSY prevents leakage from intermediate information, thereby protecting the privacy of data owners. PSY is designed to be inserted into LoRA (Low-Rank Adaptation), an efficient fine-tuning method that adapts the model without updating all of its parameters.

### Main Contributions

1. **Proposing the PSY module**: By inserting a posterior sampling layer into LoRA, PSY alleviates the LLM's memorization of fine-tuning data and reduces the risk of privacy leakage.
2. **Experimental verification**: The authors ran experiments on three different LLM architectures fine-tuned on three datasets with LoRA, and evaluated PSY against state-of-the-art membership inference attacks (MIA) and data extraction attacks (DEA). The results show that PSY consistently reduces the attack success rate while having almost no negative impact on fine-tuning or final model performance.
3. **Potential applications**: PSY demonstrates the feasibility of enhancing privacy protection through latent space extensions, providing new ideas for future research.

### Formula Representation

The formulas in this paper describe how PSY works:

- Output of the posterior sampling layer: \[ z = \mu(Ax) + \epsilon\Sigma(Ax) \] where \(z\) is the latent vector sampled from a distribution with mean \(\mu(Ax)\) and covariance \(\Sigma(Ax)\), and \(\epsilon\) is random noise.
- Modified forward propagation: \[ h = W_0x + Bz = W_0x + B[\mu(Ax) + \epsilon\Sigma(Ax)] \]

With this modification, PSY aims to improve privacy protection while maintaining the model's performance.
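
The two formulas above can be traced in a few lines of code. What follows is a minimal sketch, not the authors' implementation. It makes several assumptions that the summary does not state: \(\mu\) is a linear head (`W_mu`), \(\Sigma\) is diagonal and obtained by applying a softplus to a second linear head (`W_sigma`), and \(\epsilon\) is standard Gaussian noise. The names `matvec`, `psy_lora_forward`, `W_mu`, `W_sigma`, and the `sample` flag are hypothetical and chosen only for illustration.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def psy_lora_forward(x, W0, A, B, W_mu, W_sigma, rng=None, sample=True):
    """Compute h = W0 x + B z, with z = mu(Ax) + eps * sigma(Ax).

    Assumptions (not taken from the paper): mu is the linear map W_mu,
    and the covariance is diagonal with per-coordinate scale
    softplus(W_sigma @ Ax).
    """
    rng = rng or random.Random()
    a = matvec(A, x)                      # low-rank projection Ax
    mu = matvec(W_mu, a)                  # assumed mean head mu(Ax)
    # Softplus keeps the assumed per-coordinate scale positive.
    sigma = [math.log1p(math.exp(s)) for s in matvec(W_sigma, a)]
    if sample:
        eps = [rng.gauss(0.0, 1.0) for _ in mu]   # eps ~ N(0, I)
    else:
        eps = [0.0] * len(mu)                     # noise switched off
    z = [m + e * s for m, e, s in zip(mu, eps, sigma)]
    frozen = matvec(W0, x)                # frozen pretrained path W0 x
    update = matvec(B, z)                 # low-rank update B z
    return [f + u for f, u in zip(frozen, update)]


# Toy dimensions: input 3, output 2, rank 2.
x = [1.0, 2.0, 3.0]
W0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
A = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
W_mu = [[1.0, 0.0], [0.0, 1.0]]
W_sigma = [[0.0, 0.0], [0.0, 0.0]]

h_noisy = psy_lora_forward(x, W0, A, B, W_mu, W_sigma, rng=random.Random(0))
h_mean = psy_lora_forward(x, W0, A, B, W_mu, W_sigma, sample=False)
```

With `sample=False` the noise term vanishes and the output reduces to \(W_0x + B\mu(Ax)\), which is also the expected output under zero-mean noise. With sampling enabled, each forward pass perturbs the low-rank update, so the intermediate vector \(z\) is not a deterministic function of the input.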