Linguistic Steganography by Sampling-based Language Generation.

Rui Yang, Zhen-Hua Ling
DOI: https://doi.org/10.1109/apsipaasc47483.2019.9023313
2019-01-01
Abstract: Linguistic steganography aims to hide secret messages within text carriers. In this paper, we propose a linguistic steganography method by means of sampling-based language generation. Compared with deterministic text generation using beam search, the sampling-based approach increases the redundancy of the generated text, which benefits information hiding. The arithmetic coding (AC) algorithm is adopted to embed messages in our proposed method. Its performance is compared with that of fixed-length coding (FLC) and variable-length coding (VLC), which were designed for embedding messages during deterministic text generation. In addition, KL-divergence-based and temperature-based strategies are designed to control the embedding rates of FLC, VLC, and AC, respectively. Experiments using a story generation model show that AC performed better than FLC and VLC when embedding messages during sampling-based text generation. With an embedding rate of 1.45 bits/word, our AC-based steganography method achieved ideal imperceptibility, and the subjective quality of its generated text is as good as that of text generated without steganography.
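As a rough illustration of how arithmetic-coding-style embedding can work, the sketch below treats the secret bits as a binary fraction in [0, 1) and, at each step, picks the token whose probability sub-interval contains that fraction. This is a minimal toy under stated assumptions, not the authors' implementation: `toy_model`, `embed`, and `extract` are hypothetical names, the "language model" is a fixed four-word distribution, and exact fractions are used instead of the finite-precision arithmetic a real system would need.

```python
from fractions import Fraction

def toy_model(prefix):
    # Hypothetical stand-in for a language model: it ignores the prefix and
    # always returns the same next-token distribution (probabilities sum to 1).
    return [("the", Fraction(1, 2)), ("a", Fraction(1, 4)),
            ("one", Fraction(1, 8)), ("some", Fraction(1, 8))]

def bits_to_fraction(bits):
    # Interpret "1011" as the binary fraction 0.1011.
    return sum(Fraction(int(b), 2 ** (i + 1)) for i, b in enumerate(bits))

def embed(bits, n_tokens, model=toy_model):
    # Choose, at each step, the token whose sub-interval contains the message.
    x = bits_to_fraction(bits)
    lo, hi = Fraction(0), Fraction(1)
    tokens = []
    for _ in range(n_tokens):
        width, cum = hi - lo, lo
        for tok, p in model(tokens):
            nxt = cum + width * p
            if cum <= x < nxt:
                tokens.append(tok)
                lo, hi = cum, nxt
                break
            cum = nxt
    return tokens

def extract(tokens, model=toy_model):
    # Rebuild the interval from the tokens, then read off the leading binary
    # digits shared by every number in [lo, hi).
    lo, hi = Fraction(0), Fraction(1)
    prefix = []
    for tok in tokens:
        width, cum = hi - lo, lo
        for cand, p in model(prefix):
            nxt = cum + width * p
            if cand == tok:
                lo, hi = cum, nxt
                break
            cum = nxt
        prefix.append(tok)
    half, bits = Fraction(1, 2), ""
    while True:
        if hi <= half:
            bits, lo, hi = bits + "0", lo * 2, hi * 2
        elif lo >= half:
            bits, lo, hi = bits + "1", lo * 2 - 1, hi * 2 - 1
        else:
            return bits

stego = embed("1011", n_tokens=3)
recovered = extract(stego)
print(stego, recovered)
```

In this sketch the recovered string begins with the embedded bits and may carry extra trailing digits, so the receiver is assumed to know the message length. When the secret bits are uniformly random, the chosen tokens follow the model's own distribution, which is the intuition behind pairing arithmetic coding with sampling-based generation.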