Linguistic Steganalysis Merging Semantic and Statistical Features.

Shengnan Guo, Jianyi Liu, Zhongliang Yang, Weike You, Ru Zhang
DOI: https://doi.org/10.1109/lsp.2022.3212630
2022-01-01
IEEE Signal Processing Letters
Abstract: With the rapid development of natural language processing (NLP), a growing number of linguistic steganography methods have appeared in recent years, which may pose significant challenges to cyberspace security. Because deep neural networks (DNNs) can learn semantic features from large volumes of text, traditional steganalysis methods built on handcrafted features have gradually given way to DNN-based methods. However, it remains an open question whether DNN-based steganalysis methods extract enough carrier features to detect steganography effectively, and thus whether they can fully replace traditional methods based on handcrafted features. To explore this question, in this letter we propose a new steganalysis method that integrates semantic and statistical features. We use BERT to extract semantic features and TF-IDF with an autoencoder to obtain statistical features of the input text. Finally, we design a fusion mechanism to combine the two kinds of features. Experimental results show that, owing to the added statistical features, the proposed model significantly improves detection performance over current DNN-based linguistic steganalysis models.
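The statistical branch described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the fusion mechanism, so plain concatenation is used here as a stand-in, the BERT features are replaced by placeholder zero vectors, and the autoencoder step is omitted. The names `tfidf_vectors` and `fuse` and the smoothed IDF variant are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors): one TF-IDF vector per tokenized document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (one common variant, assumed here): log((1 + N) / (1 + df)) + 1.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


def fuse(semantic, statistical):
    """Hypothetical fusion: concatenate the two feature vectors."""
    return list(semantic) + list(statistical)


texts = ["the cat sat on the mat", "the dog sat on the log"]
docs = [t.split() for t in texts]
vocab, stat_feats = tfidf_vectors(docs)

# Placeholder semantic vectors; in the letter these would come from BERT.
sem_feats = [[0.0] * 4 for _ in docs]
fused = [fuse(s, t) for s, t in zip(sem_feats, stat_feats)]
```

In a fuller pipeline, the TF-IDF vectors would first be compressed by an autoencoder and the placeholder vectors would be replaced by BERT sentence embeddings before the fused vector is passed to a classifier.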