Label Transform Based Cross-Language Speaker Adaptation in Bilingual (Mandarin-English) TTS
Yongjin So, Jia, Yongxin Wang, Lianhong Cai
DOI: https://doi.org/10.1109/icalip.2012.6376754
2012-01-01
Abstract: This paper studies cross-language speaker adaptation for HMM-based speech synthesis. To handle the case where the adaptation data and the main corpus are in different languages, we propose a cross-language speaker adaptation approach based on label transformation. To transform phone sequences between English and Chinese, a new Mandarin-English phonetic alphabet, HCSIPA, is designed. Then, in addition to the traditional Kullback-Leibler divergence (KLD), we propose a phoneme similarity measure, AMD, which takes articulation differences into account when measuring the similarity between phonemes. Finally, a perception-based phoneme mapping strategy is implemented to increase the mapping accuracy between Mandarin and English phonemes. Perceptual tests support the validity of our approach: the adapted speech is highly natural and is judged to be similar to the target speaker.
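As an illustration of the KLD-based part of the phoneme similarity measure only, the following minimal sketch maps each phoneme of one language to its closest phoneme in the other using a symmetric Kullback-Leibler divergence. It rests on assumptions not stated in the abstract: each phoneme is modelled as a single diagonal-covariance Gaussian, the labels (`EN_a`, `ZH_x`, and so on) and the numeric values are hypothetical toy data, and the function names are invented for illustration. It does not implement AMD, HCSIPA, or the perception-based mapping strategy, whose definitions are not given here.

```python
import math


def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for two diagonal-covariance Gaussians."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q)
    )


def symmetric_kld(p, q):
    """Symmetrised divergence; p and q are (means, variances) pairs."""
    return kl_diag_gauss(*p, *q) + kl_diag_gauss(*q, *p)


def map_phonemes(source, target):
    """Map every source phoneme to the target phoneme with the smallest divergence."""
    return {
        s_label: min(target, key=lambda t_label: symmetric_kld(s_model, target[t_label]))
        for s_label, s_model in source.items()
    }


# Hypothetical toy models: label -> (means, variances). Not taken from the paper.
english = {
    "EN_a": ([1.0, 0.0], [1.0, 1.0]),
    "EN_b": ([5.0, 5.0], [1.0, 1.0]),
}
mandarin = {
    "ZH_x": ([0.9, 0.1], [1.0, 1.0]),
    "ZH_y": ([4.8, 5.2], [1.0, 1.0]),
}

mapping = map_phonemes(english, mandarin)
print(mapping)
```

A purely acoustic nearest-neighbour mapping like this one ignores how the phonemes are articulated and perceived, which is the gap the abstract says AMD and the perception-based strategy are meant to address.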