PII-Compass: Guiding LLM training data extraction prompts towards the target PII via grounding

Krishna Kanth Nakka,Ahmed Frikha,Ricardo Mendes,Xue Jiang,Xuebing Zhou
2024-07-03
Abstract:The latest and most impactful advances in large models stem from their increased size. Unfortunately, this translates into an improved memorization capacity, raising data privacy concerns. Specifically, it has been shown that models can output personal identifiable information (PII) contained in their training data. However, reported PIII extraction performance varies widely, and there is no consensus on the optimal methodology to evaluate this risk, resulting in underestimating realistic adversaries. In this work, we empirically demonstrate that it is possible to improve the extractability of PII by over ten-fold by grounding the prefix of the manually constructed extraction prompt with in-domain data. Our approach, PII-Compass, achieves phone number extraction rates of 0.92%, 3.9%, and 6.86% with 1, 128, and 2308 queries, respectively, i.e., the phone number of 1 person in 15 is extractable.
Cryptography and Security,Artificial Intelligence,Computation and Language,Machine Learning
What problem does this paper attempt to address?
The paper aims to address the issue of Personally Identifiable Information (PII) leakage in Large Language Models (LLMs). Specifically, the researchers found that existing PII extraction methods exhibit significant performance differences and lack an optimal method for assessing this risk, leading to an underestimation of the threat posed by actual attackers. The paper proposes the PII-Compass method, which significantly improves the extraction rate of PII by utilizing real prefixes related to the target data subject to construct extraction prompts. Experimental results show that this method increases the extraction success rate by more than 10 times compared to simple manually constructed prompts, especially in black-box access scenarios. By combining manual prompts with real prefixes from other data subjects, the extraction rate of phone numbers can be increased to nearly 7%, meaning that the phone number of 1 in every 15 people can be extracted.