An Improved Cross-Language Model Adaptation Method for Speech Synthesis

LIU Hang, LING Zhen-Hua, GUO Wu, DAI Li-Rong
2011-01-01
Pattern Recognition and Artificial Intelligence
Abstract: Cross-language model adaptation in statistical parametric speech synthesis is used to rapidly construct a text-to-speech (TTS) system with the target speaker's characteristics when the source and target speakers use different languages. In this paper, the conventional cross-language adaptation method based on phone mapping and triphone models is improved in two ways. First, phone mapping is combined with data selection to improve its reliability. Second, cross-language prosodic information mapping is introduced to exploit prosodic information, which the triphone model ignores. Experiments on Chinese-to-English adaptation show that speech synthesized with the improved method has much better naturalness and speaker similarity than speech synthesized with the conventional method.