Voice Conversion with High Naturalness Using Spectrum and Super-segmental Feature Transform

Yao-e Ding, Yi-biao Yu
DOI: https://doi.org/10.3969/j.issn.1673-047X.2009.04.003
2009-01-01
Abstract: Voice conversion is a technique that modifies a source speaker's speech so that it carries the individuality of a target speaker. This paper proposes a new Chinese voice conversion system with three parts: spectral envelope transformation, excitation source transformation, and super-segmental (suprasegmental) prosodic adjustment. In the first and second parts, vector quantization is used to map the spectral envelope and the excitation (residual) signal from the source speaker's voice to the target speaker's voice. In the third part, the super-segmental features of the speech are adjusted with a back-propagation network. Experiments show that the proposed algorithm is effective for voice conversion and produces quite natural-sounding speech.
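The abstract does not specify how the vector quantization mapping is built, so the following is only a minimal sketch of one common variant, assuming time-aligned source and target feature frames (for example, spectral envelope parameters). The function names `nearest_codeword`, `train_vq_mapping`, and `convert`, the k-means training loop, and the per-cell averaging of target frames are all illustrative assumptions and are not taken from the paper. The sketch covers only the spectral mapping step, not the excitation transform or the back-propagation network.

```python
import numpy as np

def nearest_codeword(frames, codebook):
    """Return, for each frame, the index of the closest codeword."""
    # Squared Euclidean distance from every frame to every codeword.
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def train_vq_mapping(src, tgt, n_codewords, n_iter=20, seed=0):
    """Build a source codebook and a paired table of target vectors.

    src, tgt: float arrays of shape (n_frames, dim), assumed time-aligned.
    This is a simplified codebook mapping (an assumption, not the paper's method).
    """
    rng = np.random.default_rng(seed)
    # Initialise the source codebook from randomly chosen source frames.
    codebook = src[rng.choice(len(src), size=n_codewords, replace=False)].copy()
    for _ in range(n_iter):  # plain k-means (Lloyd) updates
        labels = nearest_codeword(src, codebook)
        for k in range(n_codewords):
            members = src[labels == k]
            if len(members) > 0:
                codebook[k] = members.mean(axis=0)
    # Each source cell maps to the mean of the target frames aligned with it.
    labels = nearest_codeword(src, codebook)
    mapped = np.zeros((n_codewords, tgt.shape[1]))
    for k in range(n_codewords):
        members = tgt[labels == k]
        # Empty cells fall back to the global target mean.
        mapped[k] = members.mean(axis=0) if len(members) > 0 else tgt.mean(axis=0)
    return codebook, mapped

def convert(frames, codebook, mapped):
    """Quantize source frames and replace each with its mapped target vector."""
    return mapped[nearest_codeword(frames, codebook)]
```

Because every frame in a cell is replaced by the same target vector, a hard mapping like this produces a discontinuous output; practical systems usually smooth or interpolate between codewords.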