Abstract: In recent years, a growing number of information hiding techniques have targeted network streaming media, focusing on how to covertly and efficiently embed secret information into media signals transmitted in real time in order to achieve concealed communication. Misuse of these techniques poses significant security risks, such as the spread of malicious code, commands, and viruses. Current steganalysis methods for network voice streams face two major challenges: efficient detection at low embedding rates and efficient detection on short segments. Both arise for the same reason: at low embedding rates (e.g., as low as 10%) and short durations (e.g., only 0.1 seconds), detection models cannot acquire sufficiently rich sample features, which makes effective steganalysis difficult. To address these challenges, this paper introduces a Dual-View VoIP Steganalysis Framework (DVSF). The framework first randomly obfuscates parts of the native steganographic descriptors in VoIP stream segments, making the steganographic features of hard-to-detect samples more pronounced and easier to learn. Building on the global features of the VoIP stream, it then captures fine-grained local features related to steganography. Specially constructed VoIP segment triplets further adjust the feature distances within the model. Together, these steps effectively address the difficulty of detection in VoIP. Extensive experiments demonstrate that our method significantly improves the accuracy of streaming voice steganalysis in these challenging detection scenarios, surpassing existing state-of-the-art methods and offering superior near-real-time performance.
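The two mechanisms named in the abstract, random obfuscation of descriptors and triplet-based adjustment of feature distances, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `MASK` sentinel, the `mask_codewords` and `triplet_loss` helpers, the representation of a segment as a list of frames of integer codewords, and the use of the standard triplet margin loss with squared Euclidean distance.

```python
import random

MASK = -1  # hypothetical sentinel for an obfuscated codeword (assumption)


def mask_codewords(frames, mask_ratio, rng):
    """Return a copy of `frames` in which each codeword is independently
    replaced by MASK with probability `mask_ratio`."""
    return [
        [MASK if rng.random() < mask_ratio else codeword for codeword in frame]
        for frame in frames
    ]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: zero once the anchor is closer to the
    positive than to the negative by at least `margin`."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


if __name__ == "__main__":
    rng = random.Random(0)
    # A toy segment: 10 frames of 3 integer codewords each (sizes are assumptions).
    segment = [[rng.randrange(128) for _ in range(3)] for _ in range(10)]
    masked = mask_codewords(segment, mask_ratio=0.3, rng=rng)
    print(masked)

    # Embeddings would come from a trained model; these are hand-picked vectors.
    print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
```

In this sketch the anchor and positive would be embeddings of segments from the same class (both cover or both stego) and the negative an embedding from the other class, so minimizing the loss pulls same-class segments together and pushes the classes apart. How DVSF actually builds its triplets and which descriptors it obfuscates is not specified in the abstract.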

TENet: Leveraging Transformer Encoders for Steganalysis of QIM Steganography in VoIP Speech Streams

JPEG Quantization-Distribution Steganalytic Method Attacking JSteg

Universal Methodology for Developing Quantitative Steganalysis

Detection of QIM-Based Steganography in VoIP Streams: A MobileViT-Inspired Model

Blind JPEG Steganalysis Using Features Derived from Multi-Domain

Steganalysis of VoIP Streams with CNN-LSTM Network

Practical Deep Learning Models for QIM-based VoIP Steganalysis

Frame-level steganalysis of QIM steganography in compressed speech based on multi-dimensional perspective of codeword correlations

RNN-SM: Fast Steganalysis of VoIP Streams Using Recurrent Neural Network

Steganography in Low Bit-Rate Speech Streams Based on Quantization Index Modulation Controlled by Keys

Efficient Streaming Voice Steganalysis in Challenging Detection Scenarios

Real-time Steganalysis for Streaming Media Based on Multi-Channel Convolutional Sliding Windows

Steganalysis of Compressed Speech to Detect Covert Voice over Internet Protocol Channels

Hierarchical Representation Network for Steganalysis of QIM Steganography in Low-Bit-Rate Speech Signals

A Steganalysis Method for G.729A Compressed Speech Stream Based on Codeword Distribution Characteristics

Steganalysis of Analysis-By-Synthesis Speech Exploiting Pulse-Position Distribution Characteristics

Efficient Blind Steganalysis Algorithm for QIM Encoding

A Covert Communication Model Based on Least Significant Bits Steganography in Voice over IP

Improved Wet Paper Code Using Simplified Hamming Parity-Check Matrix and Its Application in Voice-over-IP Steganography

Universal Steganography Model for Low Bit-Rate Speech Codec

Adaptive Voice-over-IP Steganography Based on Quantitative Performance Ranking