Adversarial Post-Processing of Voice Conversion Against Spoofing Detection

Yi-Yang Ding, Jing-Xuan Zhang, Li-Juan Liu, Yuan Jiang, Yu Hu, Zhen-Hua Ling
2020-01-01
Abstract: With the development of speech synthesis and voice conversion techniques, the anti-spoofing task of detecting artificial speech signals has recently received increasing research attention. State-of-the-art spoofing detectors can distinguish utterances generated by voice conversion from natural ones with high accuracy. This paper proposes a method that improves the ability of voice conversion models to evade spoofing detection by post-processing the converted speech with a neural network. The network is built from long short-term memory (LSTM) layers and is trained to reduce the distance between the linear frequency cepstral coefficients (LFCC) of converted utterances and those of natural references. In our experiments, the SAS dataset was used to construct the anti-spoofing system, and the VCTK dataset was used to build the voice conversion models. Experimental results show that the proposed method significantly reduces the detection rate of the anti-spoofing system without degrading the subjective performance of the converted speech.
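To make the training criterion concrete, the following is a minimal sketch of an LFCC extractor and a frame-wise distance between a converted utterance and a natural reference. It is not the authors' implementation: the abstract does not specify the distance measure, the frame settings, or the number of coefficients, so the mean squared error, the 512-point FFT, the 160-sample hop, and the 20 filters used here are all assumptions, and the function names (`linear_filterbank`, `lfcc`, `lfcc_distance`) are hypothetical. The LSTM post-processing network itself is not shown.

```python
import numpy as np


def linear_filterbank(n_filters, n_fft):
    """Triangular filters whose centers are spaced linearly over the FFT bins."""
    n_bins = n_fft // 2 + 1
    edges = np.linspace(0, n_bins - 1, n_filters + 2)
    bins = np.arange(n_bins)
    fb = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        left, center, right = edges[i], edges[i + 1], edges[i + 2]
        rising = (bins - left) / (center - left)
        falling = (right - bins) / (right - center)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def lfcc(signal, n_fft=512, hop=160, n_filters=20, n_ceps=20):
    """Return an (n_frames, n_ceps) LFCC matrix (all settings are assumed)."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_energy = np.log(power @ linear_filterbank(n_filters, n_fft).T + 1e-10)
    # Unnormalized DCT-II over the filter axis gives the cepstral coefficients.
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_filters)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return log_energy @ dct.T


def lfcc_distance(converted, natural):
    """Mean squared LFCC difference (an assumed choice of distance)."""
    a, b = lfcc(converted), lfcc(natural)
    n = min(len(a), len(b))  # compare only the frames both signals have
    return float(np.mean((a[:n] - b[:n]) ** 2))
```

In a training setup of the kind the abstract describes, a quantity like `lfcc_distance` would be computed in a differentiable framework and minimized with respect to the post-processing network's parameters; the NumPy version above only illustrates what is being measured.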