Exposing Speech Resampling Manipulation by Local Texture Analysis on Spectrogram Images

Yujin Zhang, Shuxian Dai, Wanqing Song, Lijun Zhang, Dongmei Li
DOI: https://doi.org/10.3390/electronics9010023
IF: 2.9
2020-01-01
Electronics
Abstract: Speech tampering may be aided by the resampling operation. Detecting resampling effectively is therefore important for speech forensics, yet few studies address speech resampling detection. The purpose of this paper is to provide a new approach to detecting speech resampling. After resampling, the speech signal changes in a regular way in the time–frequency domain. We theoretically analyze the relationship between the time domain and the frequency domain of resampled speech: compared with the original speech, the bandwidth of resampled speech is stretched or compressed. First, a spectrogram is generated from the speech by the short-time Fourier transform (STFT). Then, the local binary pattern (LBP) operator is applied to model the statistical changes in the spectrogram, and the LBP histogram is computed as the discriminative feature. Finally, a support vector machine (SVM) classifies these features to determine whether the speech has undergone resampling. The experimental results show that the proposed method outperforms some existing methods across different resampling scenarios, and that the proposed features are very robust against commonly used compression post-processing. This highlights the potential of the proposed method as a speech resampling detection tool in practical forensic applications.
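
The feature-extraction stages named in the abstract (STFT spectrogram, then LBP codes, then an LBP histogram) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the frame length, hop size, Hann window, log-magnitude scaling, basic 3x3 LBP with 256 bins, and the synthetic two-tone test signal are all assumptions chosen for brevity, and the function names are hypothetical. The SVM stage is omitted; in practice the resulting histogram would be passed to a classifier as the feature vector.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Log-magnitude STFT (assumed Hann window). Returns rows = frequency bins, columns = frames."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    # Precompute the DFT basis for the non-negative frequency bins.
    basis = [[cmath.exp(-2j * math.pi * k * n / frame_len) for n in range(frame_len)]
             for k in range(n_bins)]
    columns = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        columns.append([
            math.log(abs(sum(x * b for x, b in zip(frame, row))) + 1e-10)
            for row in basis
        ])
    # Transpose from (time, frequency) to (frequency, time).
    return [list(row) for row in zip(*columns)]


def lbp_histogram(image):
    """Normalized 256-bin histogram of basic 8-neighbour LBP codes over interior pixels."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as large as the centre.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist]


# Synthetic stand-in for a speech signal (assumption: two sinusoids, 1024 samples).
signal = [math.sin(2 * math.pi * 0.05 * n) + 0.5 * math.sin(2 * math.pi * 0.21 * n)
          for n in range(1024)]
spec = spectrogram(signal)
features = lbp_histogram(spec)  # 256-dimensional feature vector for a classifier
```

Because the histogram is normalized, feature vectors from recordings of different lengths are directly comparable, which is convenient when training a classifier on them.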