VISUAL SPEECH SYNTHESIS BASED ON ARTICULATORY TRAJECTORY

Zheng Hongna, Bai Jing, Wang Lan, Zhu Yun
DOI: https://doi.org/10.3969/j.issn.1000-386x.2013.06.067
2013-01-01
Abstract: This paper addresses speech visualisation. A modified CM co-articulation model is presented to capture the movement of each articulatory organ of real speakers. The model is used to synthesise the articulatory trajectories of Chinese characters, which in turn drive a virtual 3D audiovisual talking head that intuitively displays the motions of both the normally visible and the normally invisible articulators. Experiments show that the trajectories synthesised by the modified method are closer to the actual articulatory trajectories. In addition, three groups of perception experiments are designed to compare quantitatively the roles of tongue reading and lip reading in articulatory perception and recognition. The results show that adding lip reading information improves the recognition rate by 25.8% compared with noisy speech alone, while adding tongue reading information improves it by 26.7% compared with audio alone. Therefore, when speech is degraded, tongue reading information can play a greater supplementary role than lip reading information, and its recognition capability is comparable to that of lip reading.