Voice conversion based on improved GMM and spectrum with synchronous prosody

ZHANG Bing, YU Yi-biao
DOI: https://doi.org/10.1109/ICOSP.2008.4697217
2008-01-01
Abstract: A new voice conversion approach is proposed, based on an improved GMM speaker model and a short-time spectrum with synchronous prosody. The improved GMM speaker model, which is trained on feature vectors of the source and target speakers, can overcome the over-smoothing phenomenon. The short-time spectrum with prosody is composed of LSF parameters and a pitch parameter. Compared with conventional methods, in which the pitch is usually set as constant, it describes the speaker's vocal tract and excitation characteristics more accurately. Experimental results show that this method effectively describes the individual characteristics of the source and target speakers and the transformation relationship between them. In addition, the transformed speech has good quality, and the speaker's individuality is transformed well.
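To make the two ingredients named in the abstract concrete, the following is a minimal sketch of a per-frame feature that stacks LSF coefficients with pitch, followed by the conventional joint-density GMM mapping from source to target features. It is not the paper's improved GMM, whose details the abstract does not give. The function names, the use of log-F0, the restriction to voiced frames, and the layout of the joint mean and covariance (source block first, target block second) are all assumptions made for illustration.

```python
import numpy as np


def stack_features(lsf, f0):
    """Append log-F0 to each frame's LSF vector.

    Assumption: all frames are voiced (f0 > 0); unvoiced frames would
    need separate handling, which this sketch does not attempt.
    """
    lsf = np.asarray(lsf, dtype=float)
    log_f0 = np.log(np.asarray(f0, dtype=float))[:, None]
    return np.hstack([lsf, log_f0])


def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal distribution at x."""
    diff = x - mean
    k = mean.shape[0]
    norm = np.sqrt((2.0 * np.pi) ** k * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm


def convert_frame(x, weights, means, covs):
    """Conventional joint-density GMM mapping of one source frame.

    Each mean is [mu_x, mu_y] and each covariance is the matching
    block matrix [[S_xx, S_xy], [S_yx, S_yy]] (assumed layout).
    """
    dim = x.shape[0]
    # Posterior probability of each mixture component given the source frame.
    post = np.array([
        w * gaussian_pdf(x, m[:dim], c[:dim, :dim])
        for w, m, c in zip(weights, means, covs)
    ])
    post /= post.sum()
    # Posterior-weighted sum of per-component linear regressions.
    y = np.zeros(means[0].shape[0] - dim)
    for p, m, c in zip(post, means, covs):
        mu_x, mu_y = m[:dim], m[dim:]
        cov_xx, cov_yx = c[:dim, :dim], c[dim:, :dim]
        y += p * (mu_y + cov_yx @ np.linalg.solve(cov_xx, x - mu_x))
    return y
```

Because every converted frame is a posterior-weighted average of component-wise regressions, the output of this conventional mapping tends to be averaged out across components, which is the over-smoothing effect the abstract says the improved model is meant to overcome.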