TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods

Gabriel Loiseau, Damien Sileo, Damien Riquet, Maxime Meyer, Marc Tommasi
2024-07-31
Abstract: Authorship obfuscation aims to disguise the identity of an author within a text by altering the writing style, vocabulary, syntax, and other linguistic features associated with the text author. This alteration needs to balance privacy and utility. While strong obfuscation techniques can effectively hide the author's identity, they often degrade the quality and usefulness of the text for its intended purpose. Conversely, maintaining high utility tends to provide insufficient privacy, making it easier for an adversary to de-anonymize the author. Thus, achieving an optimal trade-off between these two conflicting objectives is crucial. In this paper, we propose TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization, a new unsupervised authorship obfuscation method whose goal is to optimize the privacy-utility trade-off by regenerating the entire text considering its downstream utility. Our approach leverages policy optimization as a fine-tuning paradigm over small language models in order to rewrite texts while protecting the author's identity and preserving downstream task utility. We show that our approach substantially reduces the accuracy of attackers while preserving utility. We make our code and models publicly available.
Computation and Language
What problem does this paper attempt to address?
The paper addresses the protection of author identity in text: how to rewrite a text so that its original author cannot be identified, without significantly reducing the text's utility. The research pursues the following goals:

1. **Balancing Privacy and Usability**: The goal of Authorship Obfuscation (AO) is to modify features such as text style, vocabulary, and syntax while maintaining the text's usability and purpose. Excessive modification damages the text's quality and utility, while insufficient modification fails to protect the author's privacy.
2. **Task-Oriented Authorship Obfuscation**: A new method called TAROT (Task-Oriented Authorship Obfuscation Using Policy Optimization) is proposed. It uses policy optimization algorithms to rewrite the text so as to maximize privacy protection while retaining utility for downstream tasks.
3. **Addressing Limitations of Existing Methods**: Existing authorship obfuscation methods often focus on minimizing changes to the text content in order to maintain semantic coherence and meaning, but this approach is often insufficient against real-world attack scenarios.

### Main Contributions

1. **Framework Design**: A new framework is designed for task-oriented authorship obfuscation. It guides generative models through policy optimization algorithms to maximize privacy protection while retaining utility for downstream tasks.
2. **Model Proposal**: The TAROT model is proposed, an unsupervised method that can obfuscate text without prior knowledge of the author and can maintain text utility across various tasks. The model has two versions, TAROT-PPO and TAROT-DPO, based on different policy optimization algorithms.
3. **Evaluation Experiments**: Evaluation experiments were conducted on three datasets covering movie reviews, blog posts, and academic documents. They demonstrate that TAROT can perform authorship obfuscation for different tasks across different datasets while protecting the author's identity.

Through this work, the paper aims to provide a method that effectively protects the author's privacy while maintaining the text's utility, achieving a good balance between privacy protection and text usability.
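
To make the privacy-utility trade-off concrete, the sketch below shows one way a scalar reward for policy optimization could combine the two objectives. The summary above does not say how TAROT computes its rewards, so everything here is an assumption for illustration: the function names, the use of cosine similarity over style and task embedding vectors, the `privacy_weight` parameter, and the best-versus-worst rule for building preference pairs are all hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def obfuscation_reward(
    orig_style: Sequence[float],
    new_style: Sequence[float],
    orig_task: Sequence[float],
    new_task: Sequence[float],
    privacy_weight: float = 0.5,  # hypothetical trade-off knob
) -> float:
    """Hypothetical reward: high when style changes but task content is kept.

    The embeddings are assumed to come from some external style encoder and
    task encoder; this sketch only combines them.
    """
    # Privacy term: reward rewrites whose style embedding moves away
    # from the original author's style.
    privacy = 1.0 - cosine_similarity(orig_style, new_style)
    # Utility term: reward rewrites whose task embedding stays close
    # to the original text.
    utility = cosine_similarity(orig_task, new_task)
    return privacy_weight * privacy + (1.0 - privacy_weight) * utility


def preference_pair(rewards: Dict[str, float]) -> Tuple[str, str]:
    """Pick (chosen, rejected) rewrites for a DPO-style preference dataset."""
    chosen = max(rewards, key=rewards.get)
    rejected = min(rewards, key=rewards.get)
    return chosen, rejected
```

Under these assumptions, a PPO-style variant would use `obfuscation_reward` directly as the scalar feedback for each generated rewrite, while a DPO-style variant would score several candidate rewrites and pass the pair returned by `preference_pair` to the preference objective.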