Abstract: Existing reversible natural language watermarking methods have evolved from synonym substitution to arbitrary word substitution. However, they tend to overlook the sensitivity of the information carried by the original words and to prefer non-sensitive words for substitution, which risks leaking sensitive information contained in the original text. Furthermore, in aiming for reversibility, they may inadvertently compromise overall watermarking performance. To address these problems, this paper proposes a novel reversible natural language watermarking method that combines a Keyword Substitution scheme and a Prediction Error Expansion algorithm (KSPEE) for purposes such as protecting sensitive information, verifying content integrity, and protecting copyright. Specifically, KSPEE uses a keyword extraction algorithm to identify important content containing sensitive information in the original text, thereby determining the candidate positions for watermark embedding. A masked language model then predicts appropriate substitution words from the semantic context surrounding each embedding position. The prediction error expansion algorithm is then used to select the word that replaces each original keyword, so that the watermark information is embedded while the original keyword remains recoverable. Identifying and substituting keywords in this way protects the original sensitive information. Extensive experiments demonstrate that, under the constraints of semantic distortion and lossless restoration of the original content, KSPEE achieves high watermarked text quality, a higher watermark embedding rate, and strong security. More importantly, KSPEE effectively prevents the leakage of sensitive information.
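
To make the reversibility idea concrete, the following is a minimal sketch of generic prediction error expansion applied to word substitution. It is not the paper's implementation: the fixed candidate list stands in for the ranked output of a masked language model, and the choices to use a keyword's rank in that list as the integer value and rank 0 as the prediction are assumptions made only for illustration. All function names are hypothetical.

```python
from typing import List, Tuple


def pee_embed(value: int, predicted: int, bit: int) -> int:
    """Embed one bit by doubling the prediction error and adding the bit."""
    error = value - predicted
    return predicted + 2 * error + bit


def pee_extract(marked: int, predicted: int) -> Tuple[int, int]:
    """Recover the original value and the embedded bit from a marked value."""
    expanded = marked - predicted
    bit = expanded % 2        # Python's % yields 0 or 1 even for negative input
    error = expanded // 2     # floor division undoes the expansion
    return predicted + error, bit


# Assumption: a toy ranked candidate list for one masked position. In a real
# system this would come from a masked language model and would have to be
# reproducible at extraction time.
CANDIDATES: List[str] = [
    "city", "town", "village", "capital",
    "region", "district", "area", "place",
]


def substitute(keyword: str, bit: int, candidates: List[str]) -> str:
    """Replace a keyword with the candidate whose rank encodes the bit."""
    rank = candidates.index(keyword)
    marked_rank = pee_embed(rank, 0, bit)  # assumed prediction: rank 0
    if marked_rank >= len(candidates):
        raise ValueError("candidate list too short to embed at this position")
    return candidates[marked_rank]


def recover(marked_word: str, candidates: List[str]) -> Tuple[str, int]:
    """Return the original keyword and the embedded bit."""
    rank, bit = pee_extract(candidates.index(marked_word), 0)
    return candidates[rank], bit


if __name__ == "__main__":
    marked = substitute("village", 1, CANDIDATES)
    print(marked, recover(marked, CANDIDATES))
```

In this sketch the original keyword no longer appears in the watermarked text, yet both the keyword and the bit can be recovered from the substituted word, provided the same candidate list can be regenerated. Expansion doubles the rank, so a keyword ranked in the lower half of the list cannot carry a bit here; how such cases are handled in KSPEE is not stated in the abstract.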
