3D Visible Speech Animation Driven by Chinese Prosody Markup Language

Siguang Zhang, Lichun Wang, Hengliang Tang
DOI: https://doi.org/10.1109/alpit.2008.56
2008-01-01
Abstract: This paper proposes a new approach for generating smart 3D speech animation. The basic idea is to synthesize animated faces from prosodic information that the user edits with a markup language. The proposed technique takes advantage of both performance-driven and parameter-driven approaches. As a result, it greatly reduces the manual modeling workload of traditional key-frame animation, and the animation generation process can be easily controlled. To relate the prosody text to the 3D animation, our technique builds a parametric model based on an exponential formula. It takes pre-obtained 3D dynamic visemes and prosodic tags recorded in CPML (Chinese Prosody Markup Language) as input data, and outputs a segment of vivid speech animation. Experimental results show that (1) the proposed technique synthesizes animation with different effects depending on the availability of the prosodic information, and (2) the new technique produces realistic results using less data than conventional methods.