Abstract: Existing methods for reversible natural language watermarking have evolved from synonym substitution to arbitrary word substitution. However, they tend to overlook the sensitivity of the information carried by the original words and prefer non-sensitive words for substitution. As a result, sensitive information in the original text is at risk of being leaked. Furthermore, in aiming for reversibility, these methods may inadvertently compromise overall watermarking performance. To address these problems, this paper proposes a novel reversible natural language watermarking method that combines a Keyword Substitution scheme and a Prediction Error Expansion algorithm (KSPEE), for purposes such as sensitive information protection, content integrity verification, and copyright protection. Specifically, KSPEE uses a keyword extraction algorithm to identify important content containing sensitive information in the original text, thereby determining the candidate positions for watermark embedding. A masked language model then predicts appropriate substitution words from the semantic context surrounding each embedding position. In addition, the prediction error expansion algorithm selects the words that replace the original keywords, so that the watermark information is embedded successfully while the original keywords remain recoverable. Identifying keywords and substituting them in this way protects the original sensitive information. Extensive experiments demonstrate that, under the premise of limited semantic distortion and lossless restoration of the original content, KSPEE achieves outstanding watermarked text quality, a higher watermark embedding rate, and strong security. More importantly, KSPEE effectively prevents the leakage of sensitive information.
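
The abstract does not spell out how KSPEE applies prediction error expansion to words, so the following is only a minimal sketch of the classical integer form of the technique, not the authors' implementation. It assumes, purely for illustration, that each keyword is represented by an integer (for example, its rank in a candidate list produced by a masked language model) and that both sender and receiver can compute the same predicted value. The function names `pee_embed` and `pee_extract` are hypothetical.

```python
from typing import Tuple


def pee_embed(value: int, prediction: int, bit: int) -> int:
    """Embed one watermark bit by expanding the prediction error.

    value      -- integer standing in for the original item (assumed: a word rank)
    prediction -- value predicted from context, reproducible at extraction time
    bit        -- watermark bit, 0 or 1
    """
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    error = value - prediction
    # The error is doubled and the bit is stored in the freed lowest position.
    return prediction + 2 * error + bit


def pee_extract(marked: int, prediction: int) -> Tuple[int, int]:
    """Recover (original value, embedded bit) from a marked value."""
    expanded = marked - prediction
    bit = expanded % 2      # lowest position carries the watermark bit
    error = expanded // 2   # floor division restores the original error
    return prediction + error, bit


# Usage: round trip with an assumed original rank of 3 and a predicted rank of 0.
marked = pee_embed(3, 0, 1)
restored, bit = pee_extract(marked, 0)
```

The round trip is lossless because the expanded error `2 * error + bit` determines both `error` and `bit` uniquely, which is the property that lets a reversible scheme restore the original keyword after the watermark is read.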

A reversible natural language watermarking for sensitive information protection
