Message-Driven Generative Music Steganography Using MIDI-GAN
Zhaopin Su, Guofu Zhang, Zhiyuan Shi, Donghui Hu, Weiming Zhang
DOI: https://doi.org/10.1109/tdsc.2024.3372139
2024-01-01
IEEE Transactions on Dependable and Secure Computing
Abstract: Generative steganography has become a popular research topic in the field of generative AI, including generative image and synthetic speech steganography. However, music files differ from image and speech files in their statistical properties and knowledge representation, and designing a reversible transform between a secret message and music is also challenging. Therefore, existing generative steganographic methods that are effective for images or speech may not transfer directly to music. In this paper, we propose a generative music steganography method, named MIDI-GAN, which uses generative adversarial networks (GANs) to turn a secret message into an artificial stego MIDI file. The resulting stego MIDI file is small in size, has pleasant melodies, and is undetectable by deep learning-based steganalyzers. Unlike previous generative image/speech steganography, the stego MIDI can also be presented as a sequence of chord numbers, which makes it difficult for an observer to detect it or find grounds for suspicion. Moreover, these chord numbers can be transmitted through any other digital or physical medium to evade detection. Specifically, MIDI-GAN comprises a generator, a discriminator, and an extractor. The generator synthesizes a stego MIDI file from the secret message, while the discriminator pushes the statistical distribution of the stego MIDI file as close as possible to that of authentic, rather than synthetic, MIDI files. The extractor recovers the secret message from the stego MIDI file or the chord sequence. Experimental results demonstrate that MIDI-GAN achieves high concealment and security: the stego MIDI generated by our method closely resembles authentic MIDI files and maintains excellent anti-detection ability against deep learning-based steganalysis.
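The reversibility requirement mentioned above (message to chord numbers, and chord numbers back to the message) can be illustrated with a toy sketch. This is not MIDI-GAN: the learned generator and extractor are replaced here by a fixed bit-packing scheme, and the function names `message_to_chords` and `chords_to_message`, as well as the `bits_per_chord` parameter, are hypothetical choices made only for illustration.

```python
# Toy illustration only: a fixed, non-learned mapping between message bytes
# and chord numbers. It stands in for the generator/extractor pair to show
# the round-trip property; it provides no concealment or security by itself.


def _check_width(bits_per_chord: int) -> None:
    # Assumption of this sketch: each byte splits into equal-width groups.
    if bits_per_chord not in (1, 2, 4, 8):
        raise ValueError("bits_per_chord must be 1, 2, 4, or 8 in this toy scheme")


def message_to_chords(message: bytes, bits_per_chord: int = 4) -> list[int]:
    """Split each message byte into fixed-width bit groups, one chord number per group."""
    _check_width(bits_per_chord)
    mask = (1 << bits_per_chord) - 1
    chords = []
    for byte in message:
        # Walk from the most significant group down to the least significant.
        for shift in range(8 - bits_per_chord, -1, -bits_per_chord):
            chords.append((byte >> shift) & mask)
    return chords


def chords_to_message(chords: list[int], bits_per_chord: int = 4) -> bytes:
    """Invert message_to_chords by packing chord numbers back into bytes."""
    _check_width(bits_per_chord)
    mask = (1 << bits_per_chord) - 1
    per_byte = 8 // bits_per_chord
    if len(chords) % per_byte != 0:
        raise ValueError("chord sequence does not align to whole bytes")
    out = bytearray()
    for i in range(0, len(chords), per_byte):
        byte = 0
        for chord in chords[i:i + per_byte]:
            if not 0 <= chord <= mask:
                raise ValueError(f"chord number {chord} is out of range")
            byte = (byte << bits_per_chord) | chord
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    secret = b"Hi"
    chords = message_to_chords(secret)
    print(chords)
    print(chords_to_message(chords))
```

In this sketch the chord sequence is a plain list of small integers, which mirrors the point that chord numbers could be carried over any channel; in the actual method, the mapping would be produced by the trained generator and inverted by the trained extractor.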