PersonaMark: Personalized LLM watermarking for model protection and user attribution

Yuehan Zhang, Peizhuo Lv, Yinpeng Liu, Yongqiang Ma, Wei Lu, Xiaofeng Wang, Xiaozhong Liu, Jiawei Liu

2024-09-15

Abstract: The rapid development of LLMs brings both convenience and potential threats. As customized and private LLMs are widely applied, model copyright protection has become important. Text watermarking is emerging as a promising solution to AI-generated text detection and model protection. However, current text watermarks have largely ignored the need to inject different watermarks for different users, which would allow a watermark to be attributed to a specific individual. In this paper, we explore a personalized text watermarking scheme for LLM copyright protection and other scenarios, ensuring accountability and traceability in content generation. Specifically, we propose a novel text watermarking method, PersonaMark, that uses sentence structure as the hidden medium for the watermark information and optimizes the sentence-level generation algorithm to minimize disruption to the model's natural generation process. A personalized hashing function injects a unique watermark signal for each user, yielding personalized watermarked text. Because our approach operates at the sentence level rather than on token probabilities, text quality is largely preserved. With the designed multi-user hashing function, injecting unique watermark signals remains time-efficient for a large number of users. To the best of our knowledge, this is the first personalized text watermarking scheme. We conduct an extensive evaluation on four different LLMs in terms of perplexity, sentiment polarity, alignment, readability, etc. The results demonstrate that our method maintains performance with minimal perturbation to the model's behavior, allows for unbiased insertion of watermark information, and exhibits strong watermark recognition capabilities.

Cryptography and Security, Computation and Language

What problem does this paper attempt to address?

### The Problem Addressed by the Paper

The paper addresses copyright protection and user attribution for personalized large language models (LLMs). It proposes a new text watermarking method, PersonaMark, to address the following problems:

1. **Copyright Protection**:
   - With the widespread deployment of customized and proprietary LLMs, model copyright protection has become important. Existing text watermarking methods fall short here, particularly because they do not provide distinct watermarks for different users.
2. **Personalized Watermarking**:
   - Current text watermarking techniques mostly focus on detecting AI-generated text and embed a simple binary watermark signal, so they cannot produce a unique watermark for each user.
3. **Maintaining Text Quality**:
   - Existing watermarking methods embed watermarks by manipulating token probabilities, which introduces bias and can degrade text quality. PersonaMark instead embeds the watermark in sentence structure and operates at the sentence level, which the paper reports largely preserves text quality.
4. **User Attribution and Tracking**:
   - With personalized LLMs, generated content should be traceable to a specific user ID. Existing methods have limited capability in this regard, while PersonaMark addresses it with a personalized hashing function.

Through these means, the paper aims to provide a reliable and efficient copyright protection mechanism for personalized LLMs while preserving text quality and enabling accurate user attribution.
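
The idea of a per-user hash over sentence structure can be illustrated with a small sketch. This is not the paper's algorithm, which the summary above does not specify. It assumes one watermark bit per sentence, uses a keyed HMAC as a stand-in for the personalized hashing function, and replaces real syntactic parsing with a crude word-shape signature. All function names (`structure_signature`, `user_bit`, `pick_sentence`, `match_rate`, `attribute`) are hypothetical.

```python
import hashlib
import hmac


def structure_signature(sentence: str) -> str:
    """Crude stand-in for a syntactic-structure extractor (assumption).

    Each token becomes a coarse shape class: N (number), S (short word),
    L (long word), P (punctuation only). A real system would use a parser.
    """
    shapes = []
    for tok in sentence.split():
        word = tok.strip(".,;:!?\"'()")
        if not word:
            shapes.append("P")
        elif word.isdigit():
            shapes.append("N")
        elif len(word) <= 3:
            shapes.append("S")
        else:
            shapes.append("L")
    return "-".join(shapes)


def user_bit(user_key: bytes, sentence: str) -> int:
    """Keyed hash of the structure signature, reduced to a single bit."""
    msg = structure_signature(sentence).encode("utf-8")
    digest = hmac.new(user_key, msg, hashlib.sha256).digest()
    return digest[0] & 1


def pick_sentence(user_key: bytes, candidates: list[str]) -> str:
    """Prefer the first candidate whose structure hashes to bit 1 for this user.

    Falls back to the first candidate if none matches, so generation never
    stalls. Expects a non-empty candidate list.
    """
    for cand in candidates:
        if user_bit(user_key, cand) == 1:
            return cand
    return candidates[0]


def match_rate(user_key: bytes, sentences: list[str]) -> float:
    """Fraction of sentences whose structure hashes to bit 1 under this key."""
    if not sentences:
        return 0.0
    return sum(user_bit(user_key, s) for s in sentences) / len(sentences)


def attribute(user_keys: dict[str, bytes], sentences: list[str]) -> str:
    """Return the user ID whose key gives the highest match rate."""
    return max(user_keys, key=lambda uid: match_rate(user_keys[uid], sentences))


if __name__ == "__main__":
    keys = {"alice": b"alice-secret", "bob": b"bob-secret"}
    candidates = [
        "The model writes 42 words.",
        "Forty-two words are written by the model.",
        "In total, the model produces 42 words.",
    ]
    chosen = pick_sentence(keys["alice"], candidates)
    print(chosen, match_rate(keys["alice"], [chosen]))
```

In this sketch, unwatermarked text should match any given key on roughly half of its sentences, while text generated with `pick_sentence` under one user's key matches that key far more often, which is what `attribute` exploits. A practical detector would also need a statistical threshold on the match rate rather than a plain maximum over users.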

MarkLLM: An Open-Source Toolkit for LLM Watermarking

Can Watermarked LLMs be Identified by Users via Crafted Prompts?

PostMark: A Robust Blackbox Watermark for Large Language Models

Universally Optimal Watermarking Schemes for LLMs: from Theory to Practice

PLMmark: A Secure and Robust Black-Box Watermarking Framework for Pre-trained Language Models

Towards Codable Watermarking for Injecting Multi-bits Information to LLMs

Segmenting Watermarked Texts From Language Models

Signal Watermark on Large Language Models

Mark My Words: Analyzing and Evaluating Language Model Watermarks

Adaptive Text Watermark for Large Language Models

Watermarking Text Generated by Black-Box Language Models

Topic-Based Watermarks for LLM-Generated Text

Learning to Watermark LLM-generated Text via Reinforcement Learning

FreqMark: Frequency-Based Watermark for Sentence-Level Detection of LLM-Generated Text

Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models

WaterPark: A Robustness Assessment of Language Model Watermarking

Watermarking Text Data on Large Language Models for Dataset Copyright

REMARK-LLM: A Robust and Efficient Watermarking Framework for Generative Large Language Models

Countering Personalized Text-to-Image Generation with Influence Watermarks

Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?