Learning based visual speech synthesis system

Xibin Jia, Baocai Yin, Yanfen Sun, Xianpin Lin
2006-01-01
Journal of Information and Computational Science
Abstract: The paper introduces a speech-driven visual speech synthesis system. A loosely coupled mapping scheme is proposed to establish the correspondence between acoustic speech classes and visual speech classes. The mapping is learned from recorded video using a data-driven method. To strengthen the correlation between vocal and visual speech, the articulatory-lip-correlative speech mode is extracted using a genetic algorithm. The results show that the extracted feature gives the corresponding lip image classes good clustering performance. In the synthesis phase, a smooth sequence of lip images is obtained by a search approach in accordance with the input speech. In an experimental comparison with the original video, the synthetic visual speech achieves good results. Further research is needed in the synthesis phase to correct jerkiness in the output.