Abstract: Protecting the intellectual property (IP) of text such as articles and code is increasingly important, especially as sophisticated attacks become possible, such as paraphrasing by large language models (LLMs) or even unauthorized training of LLMs on copyrighted text to infringe such IP. However, existing text watermarking methods are neither robust enough against such attacks nor scalable to millions of users for practical implementation. In this paper, we propose Waterfall, the first training-free framework for robust and scalable text watermarking applicable across multiple text types (e.g., articles, code) and languages supportable by LLMs, for general text and LLM data provenance. Waterfall comprises several key innovations, such as being the first to use LLMs as paraphrasers for watermarking, along with a novel combination of techniques that are surprisingly effective in achieving robust verifiability and scalability. We empirically demonstrate that Waterfall achieves significantly better scalability, robust verifiability, and computational efficiency compared to SOTA article-text watermarking methods, and show how it can be directly applied to the watermarking of code. We also demonstrate that Waterfall can be used for LLM data provenance, where the watermarks of LLM training data can be detected in LLM output, allowing for detection of unauthorized use of data for LLM training and potentially enabling model-centric watermarking of open-sourced LLMs, which has been a limitation of existing LLM watermarking works. Our code is available at https://github.com/aoi3142/Waterfall.

What problem does this paper attempt to address?

This paper addresses the protection of intellectual property (IP) in text data, in particular against plagiarism and unauthorized use of content such as articles and code. Specifically, it focuses on the following problems:

1. **Limitations of existing text watermarking methods**:
   - Existing methods are not robust enough against sophisticated attacks, such as synonym replacement or paraphrasing by large language models (LLMs).
   - They also do not scale to millions of users.
2. **New challenges brought by LLMs**:
   - As generative LLMs become widespread, they may be trained on copyrighted text without authorization, infringing on intellectual property rights.
   - Content creators need a way to check whether their works have been used to train third-party black-box LLMs.
3. **Applicability to multiple languages and text types**:
   - A general framework that covers multiple text types (such as articles and code) and multiple languages is needed to support a wide range of applications.
4. **Efficient and scalable solutions**:
   - The solution must be robust and verifiable, and must also operate efficiently across a very large user base at a reasonable computational cost.

To address these problems, the paper proposes **Waterfall**, a training-free framework for robust and scalable text watermarking that applies to multiple text types and languages. Its main innovations include:

- **First use of LLMs as paraphrasers for watermarking**: the capabilities of LLMs are used to improve the robustness and scalability of watermarks.
- **Combining vocabulary permutation and orthogonal perturbation techniques**: together these significantly improve the robustness and verifiability of watermarks while maintaining the fidelity of the text.
- **Support for large-scale user groups**: the framework operates efficiently with millions of users, as practical deployments require.

With these innovations, Waterfall performs well across multiple benchmarks. In particular, it remains robust and verifiable under attacks such as insertion, deletion, and synonym replacement.
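
The combination of vocabulary permutation and orthogonal perturbation can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the cosine perturbation, the way the user ID seeds the permutation, the vocabulary size, and the random logits standing in for an LLM paraphraser are all assumptions made for illustration. The idea shown is that a per-user permutation decides which tokens receive a boost, and verification checks whether the tokens in a text line up with that user's perturbation.

```python
import numpy as np

VOCAB_SIZE = 1000  # toy vocabulary size (assumption); real LLM vocabularies are larger


def permutation_for(user_id: int) -> np.ndarray:
    """Pseudorandom vocabulary permutation seeded by a hypothetical user ID."""
    return np.random.default_rng(user_id).permutation(VOCAB_SIZE)


def perturbation(freq: int) -> np.ndarray:
    """Cosine wave over permuted positions.

    Vectors for distinct frequencies in 1..VOCAB_SIZE/2-1 are mutually orthogonal.
    """
    positions = np.arange(VOCAB_SIZE)
    return np.cos(2 * np.pi * freq * positions / VOCAB_SIZE)


def watermark_logits(logits, user_id, freq=1, strength=2.0):
    """Add the perturbation to the logits in the user's permuted vocabulary space."""
    perm = permutation_for(user_id)  # perm[token] = position in permuted space
    return logits + strength * perturbation(freq)[perm]


def sample_tokens(n, user_id=None, freq=1, seed=0):
    """Sample n tokens; random logits stand in for a paraphraser's next-token logits."""
    rng = np.random.default_rng(seed)
    tokens = []
    for _ in range(n):
        logits = rng.normal(size=VOCAB_SIZE)
        if user_id is not None:
            logits = watermark_logits(logits, user_id, freq)
        probs = np.exp(logits - logits.max())  # numerically stable softmax
        probs /= probs.sum()
        tokens.append(rng.choice(VOCAB_SIZE, p=probs))
    return np.array(tokens)


def verification_score(tokens, user_id, freq=1):
    """Mean perturbation value at the permuted positions of the observed tokens."""
    perm = permutation_for(user_id)
    return float(perturbation(freq)[perm[tokens]].mean())


marked = sample_tokens(400, user_id=42)
plain = sample_tokens(400, user_id=None, seed=1)
print("correct ID:   ", verification_score(marked, 42))
print("wrong ID:     ", verification_score(marked, 7))
print("unwatermarked:", verification_score(plain, 42))
```

In this sketch the score for the correct ID should be clearly positive, while the scores for a wrong ID or for unwatermarked tokens should stay close to zero. Verification needs only the tokens and the ID, not the model that produced the text.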

Waterfall: Framework for Robust and Scalable Text Watermarking and Provenance for LLMs
