Computationally Secure Steganography Based on Speech Synthesis

LI Menghan, CHEN Kejiang, ZHANG Weiming, YU Nenghai
DOI: https://doi.org/10.11959/j.issn.2096-109x.2022025
2022-01-01
Abstract: The theory of computationally secure steganography was proposed long ago, but it has not been widely adopted in mainstream steganography that uses multimedia data as the carrier. The reason is that computationally secure steganography requires either the exact distribution of the carrier or the ability to sample exactly from that distribution, and naturally captured images, audio, and video cannot meet this prerequisite. With the development of deep learning, machine-generated media such as generated images and synthesized speech have become increasingly common on the Internet, which makes generated media a reasonable steganographic carrier. Steganography can use ordinary generated media to conceal secret communication, with the goal of being indistinguishable from ordinary generated media. The distribution learned by some generative models is known or controllable, which offers an opportunity to bring computationally secure steganography into practical use. Taking a widely used speech synthesis model as an example, a computationally secure symmetric-key steganography algorithm was designed and implemented. The message is decompressed into synthetic audio by running the decoding process of arithmetic coding on the conditional probabilities of the audio sample points; the receiver, who holds the same generative model, extracts the message by reproducing the audio synthesis process. A public-key steganography algorithm was additionally designed on top of this algorithm, providing algorithmic support for a complete steganographic communication workflow. Steganographic key exchange secures the steganographic content, and the security of the steganographic behavior is achieved as well. Theoretical analysis shows that the security of the proposed algorithm is determined by the randomness of the embedded message, and steganalysis experiments further verify that an attacker cannot distinguish the synthesized carrier audio from the audio carrying the encrypted message.
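
The embedding idea in the abstract (arithmetic decoding of the message into samples, arithmetic encoding to get it back) can be illustrated with a toy sketch. This is not the authors' implementation: `toy_model` is a hypothetical four-symbol stand-in for the speech synthesis model, exact fractions replace the finite-precision arithmetic a real coder would use, the message bits are assumed to be already encrypted, and the receiver is assumed to know the message length.

```python
from fractions import Fraction
from math import ceil


def toy_model(history):
    """Hypothetical stand-in for the shared generative model.

    Returns the conditional distribution of the next sample (4 symbols)
    given the samples generated so far. Sender and receiver must run
    exactly the same function.
    """
    prev = history[-1] if history else 0
    weights = [1 + ((prev + s) % 4) for s in range(4)]  # always positive
    total = sum(weights)
    return [Fraction(w, total) for w in weights]


def embed(bits, model=toy_model):
    """Arithmetic *decoding*: turn message bits into a sample sequence."""
    n = len(bits)
    value = 0
    for b in bits:
        value = value * 2 + b
    target = Fraction(value, 2 ** n)  # message read as a binary fraction in [0, 1)
    lo, width = Fraction(0), Fraction(1)
    samples = []
    # Narrow the interval until it can contain only one n-bit fraction.
    while width > Fraction(1, 2 ** n):
        cum = lo
        for sym, p in enumerate(model(samples)):
            if target < cum + p * width:  # target lies in this symbol's slice
                lo, width = cum, p * width
                samples.append(sym)
                break
            cum += p * width
    return samples


def extract(samples, n, model=toy_model):
    """Arithmetic *encoding*: replay the model to recover the n message bits."""
    lo, width = Fraction(0), Fraction(1)
    for i, sym in enumerate(samples):
        probs = model(samples[:i])
        lo += sum(probs[:sym], Fraction(0)) * width
        width *= probs[sym]
    # The final interval has width <= 2**-n, so it holds exactly one
    # multiple of 2**-n: the smallest one that is >= lo.
    k = ceil(lo * 2 ** n)
    return [(k >> (n - 1 - i)) & 1 for i in range(n)]


message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed(message)
assert extract(stego, len(message)) == message
```

Each sample is chosen with a slice width proportional to its model probability, so when the message bits look uniformly random the samples follow the model's own distribution; this is the intuition behind the abstract's claim that security rests on the randomness of the embedded message. A practical scheme would also need fixed-precision interval arithmetic and a way to handle message termination, both of which this sketch leaves out.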