A Scheme Discriminating Between Synthetic Speech and Normal Speech

Jilun Chen,Weiqiang Zhang,Jia Liu
DOI: https://doi.org/10.1109/icalip.2016.7846613
2016-01-01
Abstract:This paper develops a system to automatically distinguish natural speech from synthetic speech. The issue of feature selection is considered. We take commonly used feature Mel-Frequency Cepstrum Coefficient (MFCC) in consideration, as well as other features such as Relative Phase Shift (RPS) and pitch tuned for Automatically Speech Recognition (ASR). We found some features are complimentary in the task of discriminating synthetic and natural speech. Gaussian Mixture Model Support Vector Machine (GMM-SVM) system is applied as classifier with feature input modified and compared to that of feature is applied in speaker recognition. Experiment on Librespeech versus online Text-to-Speech (TTS) speech synthesis platforms data set verified the effectiveness of the combination of these features.
What problem does this paper attempt to address?