Visual Speech Synthesis Algorithm Based on Chinese Visual Triphone

Hui Zhao, Chao-jing Tang
2009-01-01
Abstract: To synthesize realistic video sequences, a visual speech synthesis algorithm based on Chinese visual triphones is proposed. Drawing on the principles of Chinese pronunciation and the relationship between phonemes and visemes, the concept of the 'visual triphone' is introduced. Hidden Markov models (HMMs) are built on visual triphones. In the training stage, combined features comprising visual and audio features are used. In the synthesis stage, a sentence HMM is constructed by concatenating triphone HMMs, and the feature parameters are extracted from it. Subjective and objective evaluations indicate that the synthesized video is realistic and satisfactory.
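The concatenation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the viseme inventory, the HMM topology, or the features, so the unit labels, the `sil` padding symbol, the `left-centre+right` naming (a notation common in speech toolkits), and the helper names `to_triphones` and `concat_hmms` are all assumptions made for illustration. The sketch assumes left-to-right models whose last state keeps some probability mass in reserve as an exit probability.

```python
def to_triphones(units, pad="sil"):
    """Label each unit with its neighbours as 'left-centre+right'.

    `units` is a hypothetical sequence of viseme labels; `pad` is an
    assumed silence symbol used at the sentence boundaries.
    """
    padded = [pad] + list(units) + [pad]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


def concat_hmms(models):
    """Chain left-to-right HMM transition matrices into one sentence matrix.

    Each model is a square list of lists. The last row of each model is
    assumed to sum to less than 1; the missing mass is treated as the
    probability of leaving the model, and is routed to the first state
    of the next model.
    """
    total = sum(len(m) for m in models)
    sentence = [[0.0] * total for _ in range(total)]
    offset = 0
    for k, m in enumerate(models):
        n = len(m)
        # Copy the model's own transitions onto the block diagonal.
        for i in range(n):
            for j in range(n):
                sentence[offset + i][offset + j] = m[i][j]
        # Link the last state of this model to the first state of the next.
        if k < len(models) - 1:
            exit_prob = 1.0 - sum(m[n - 1])
            sentence[offset + n - 1][offset + n] = exit_prob
        offset += n
    return sentence


# Illustrative two-state model reused for every triphone (made-up numbers).
toy_model = [[0.6, 0.4],
             [0.0, 0.7]]

labels = to_triphones(["b", "a", "o"])
sentence_hmm = concat_hmms([toy_model for _ in labels])
```

In this sketch the sentence model simply has one block per triphone, so a sentence of three units with two-state models yields a six-state chain. Emission distributions, which would carry the combined audio and visual features, are left out for brevity.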