An Emotional Text-Driven 3D Visual Pronunciation System for Mandarin Chinese

Lingyun Yu, Changwei Luo, Jun Yu
DOI: https://doi.org/10.1007/978-981-10-3002-4_8
2016-01-01
Abstract: This paper proposes an emotional text-driven 3D visual pronunciation system for Mandarin Chinese. First, articulatory features are trained with hidden Markov models (HMMs) on an articulatory speech corpus collected by electromagnetic articulography (EMA); fully context-dependent modeling is used to exploit the rich linguistic features. Second, because the articulators can be manipulated independently, emotion is expressed more distinctly in the articulatory domain, so the differences between articulatory movements under different emotions are investigated. Third, emotional speech is generated by adjusting speech parameters such as fundamental frequency (F0), duration, and intensity using Praat. Finally, while the generated emotional speech is played, the corresponding articulatory movements are synthesized by the HMM prediction rules and used to drive the head mesh model in sync with the speech. Experiments demonstrate that the system can synthesize accurate articulator animation synchronized with the emotional speech at the phoneme level.
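The parameter-adjustment step described in the abstract can be illustrated with a minimal Python sketch. It assumes that an emotion is represented as a multiplicative change to F0, a multiplicative change to phoneme duration, and an additive change to intensity in dB. The `ProsodyScale` type, the function names, and every numeric value in `EMOTION_SCALES` are hypothetical and chosen only for illustration; the paper performs this adjustment in Praat, and its actual settings are not given in the abstract.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ProsodyScale:
    """Hypothetical per-emotion prosody adjustment (not from the paper)."""
    f0_ratio: float        # multiplicative change applied to F0 (Hz)
    duration_ratio: float  # multiplicative change applied to phoneme durations
    intensity_db: float    # additive change in intensity (dB)


# Illustrative values only; the abstract does not report the real settings.
EMOTION_SCALES: Dict[str, ProsodyScale] = {
    "neutral": ProsodyScale(1.0, 1.0, 0.0),
    "happy": ProsodyScale(1.2, 0.9, 3.0),
    "sad": ProsodyScale(0.9, 1.2, -3.0),
}


def adjust_f0(f0_hz: List[float], scale: ProsodyScale) -> List[float]:
    """Scale voiced F0 frames; frames with F0 <= 0 are treated as unvoiced."""
    return [f * scale.f0_ratio if f > 0 else 0.0 for f in f0_hz]


def adjust_durations(durations_s: List[float], scale: ProsodyScale) -> List[float]:
    """Stretch or compress each phoneme duration (seconds)."""
    return [d * scale.duration_ratio for d in durations_s]


def adjust_intensity(samples: List[float], scale: ProsodyScale) -> List[float]:
    """Apply a dB change as a linear amplitude gain: gain = 10^(dB / 20)."""
    gain = 10.0 ** (scale.intensity_db / 20.0)
    return [s * gain for s in samples]


if __name__ == "__main__":
    scale = EMOTION_SCALES["happy"]
    print(adjust_f0([200.0, 0.0, 210.0], scale))
    print(adjust_durations([0.08, 0.12], scale))
    print(adjust_intensity([0.01, -0.02], scale))
```

Scaling the phoneme durations rather than the waveform alone matters here because the same adjusted durations would also set the timing of the synthesized articulatory movements, which is what keeps the animation synchronized with the speech at the phoneme level.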