Natural Language Watermarking Via Paraphraser-Based Lexical Substitution.

Jipeng Qiang, Shiyu Zhu, Yun Li, Yi Zhu, Yunhao Yuan, Xindong Wu
DOI: https://doi.org/10.1016/j.artint.2023.103859
IF: 14.4
2023-01-01
Artificial Intelligence
Abstract: Although powerful pretrained language models generate high-quality text, they raise new concerns about potential misuse for malicious purposes. Natural language watermarking (NLW) is a technique designed to help trace the provenance of texts and guard against possible attacks, where watermark signals are embedded into cover texts using synonym substitutions. Recent BERT-based NLW methods have made remarkable progress in watermarking performance by generating substitutes for a masked target word. Yet BERT-based NLW methods focus on the context of the text rather than the meaning of the target word, which may lower the embedding capacity of the watermark. To address this limitation, this study proposes a novel NLW method that incorporates a paraphraser-based lexical substitution (LS) method. Under the premise of paraphrase preservation, the proposed NLW method uses knowledge from paraphrase modeling to generate substitute candidates for the words in the original sentence that can carry the watermark signal in their local context. We empirically show that our NLW method not only preserves meaning better but also improves the payload by more than 2 times compared with the BERT-based NLW method. In addition, compared with other state-of-the-art baselines, the experimental results show that the proposed LS method improves the Precision@1 score from 51.7% to 58.3% and from 50.5% to 62.6% on the LS07 and CoInCo benchmarks, respectively.
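To make the idea of embedding a watermark through synonym substitution concrete, here is a minimal Python sketch. It is not the authors' method: the fixed `SYNONYMS` table is a hypothetical stand-in for the candidates a paraphraser-based lexical substitution model would generate, and the names `embed` and `extract` are illustrative assumptions. Each word that appears in a synonym pair acts as a carrier of one bit, chosen by which member of the pair is written.

```python
# Toy synonym-substitution watermark. SYNONYMS is a hypothetical, hand-written
# table; a real system would obtain context-aware candidates from a model.
SYNONYMS = [
    ("big", "large"),
    ("quick", "fast"),
    ("begin", "start"),
]

# Map every carrier word to its pair and to the bit it encodes (its index).
LOOKUP = {word: (pair, bit) for pair in SYNONYMS for bit, word in enumerate(pair)}


def embed(tokens, bits):
    """Rewrite carrier words so that each one encodes the next bit."""
    remaining = iter(bits)
    out = []
    for tok in tokens:
        entry = LOOKUP.get(tok)
        bit = next(remaining, None) if entry is not None else None
        # Non-carrier words, and carriers left over once the bits run out,
        # are copied unchanged.
        out.append(tok if bit is None else entry[0][bit])
    return out


def extract(tokens):
    """Read one bit from every carrier word, in order of appearance."""
    return [LOOKUP[tok][1] for tok in tokens if tok in LOOKUP]


cover = "the big dog made a quick start".split()
marked = embed(cover, [1, 0, 1])
print(" ".join(marked))
print(extract(marked))
```

In this toy setting the payload is bounded by the number of carrier words in the cover text, so producing more acceptable substitutes per sentence raises capacity, which is the quantity the abstract reports as improved. Note that `extract` returns one bit per carrier word, so a caller that embedded fewer bits than there are carriers must keep only the first bits it expects.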