IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization

Ahmed Frikha, Nassim Walha, Krishna Kanth Nakka, Ricardo Mendes, Xue Jiang, Xuebing Zhou
2024-07-03
Abstract: In this work, we address the problem of text anonymization where the goal is to prevent adversaries from correctly inferring private attributes of the author, while keeping the text utility, i.e., meaning and semantics. We propose IncogniText, a technique that anonymizes the text to mislead a potential adversary into predicting a wrong private attribute value. Our empirical evaluation shows a reduction of private attribute leakage by more than 90%. Finally, we demonstrate the maturity of IncogniText for real-world applications by distilling its anonymization capability into a set of LoRA parameters associated with an on-device model.
Cryptography and Security, Artificial Intelligence, Computation and Language, Machine Learning
What problem does this paper attempt to address?
The paper addresses text anonymization: preventing adversaries from correctly inferring an author's private attributes (such as age or gender) while preserving the meaning and semantics of the text. It proposes IncogniText, a technique that rewrites the text so that a potential adversary is misled into predicting a wrong value for the private attribute. The reported empirical evaluation shows a reduction in private attribute leakage of more than 90%. The paper also shows that IncogniText's anonymization capability can be distilled into a set of LoRA parameters attached to a small on-device model, which the authors present as evidence of its readiness for real-world applications.
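To make the idea of misleading the adversary toward a wrong attribute value more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a decoy value is drawn at random from the candidate values other than the true one, that the rewriting is done by a caller-supplied `rewrite` function (a hypothetical stand-in for an LLM call), and that leakage is measured as the fraction of texts for which the adversary's guess matches the true value. All function names and the numbers in the usage lines are illustrative assumptions, not results from the paper.

```python
import random
from typing import Callable, Sequence, Tuple


def pick_decoy_value(true_value: str, candidates: Sequence[str],
                     rng: random.Random) -> str:
    """Pick a random attribute value that differs from the true one."""
    decoys = [c for c in candidates if c != true_value]
    if not decoys:
        raise ValueError("need at least one candidate other than the true value")
    return rng.choice(decoys)


def anonymize(text: str, true_value: str, candidates: Sequence[str],
              rewrite: Callable[[str, str], str],
              rng: random.Random) -> Tuple[str, str]:
    """Rewrite `text` so that it points toward a decoy attribute value.

    `rewrite(text, decoy)` is a hypothetical hook; in practice it would
    wrap a call to a language model.
    """
    decoy = pick_decoy_value(true_value, candidates, rng)
    return rewrite(text, decoy), decoy


def leakage_rate(predictions: Sequence[str], true_values: Sequence[str]) -> float:
    """Fraction of texts where the adversary's guess equals the true value."""
    if len(predictions) != len(true_values) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(p == t for p, t in zip(predictions, true_values))
    return hits / len(true_values)


def relative_reduction(before: float, after: float) -> float:
    """Relative drop in leakage, e.g. 0.9 means a 90% reduction."""
    if before <= 0:
        raise ValueError("baseline leakage must be positive")
    return (before - after) / before


# Illustrative usage with a trivial stand-in for the LLM rewriter.
rng = random.Random(0)
ages = ["18-29", "30-40", "41-60"]
new_text, decoy = anonymize("I just turned 35.", "30-40", ages,
                            lambda text, d: f"[rewritten toward {d}] {text}", rng)
```

Under these assumptions, a drop in adversary accuracy from 0.75 to 0.05 would correspond to a relative reduction of about 93%, which is how a figure like "more than 90%" could be read; the paper's exact leakage metric may differ.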