Research on visual automatic speech recognition

Jun He, Hua Zhang
2008-01-01
Journal of Information and Computational Science
Abstract: Realizing visual automatic speech recognition in a real environment is essential, but most researchers skip it because of its difficulty. In this paper we present novel solutions to three problems that arise in a real environment: locating the region of interest, segmenting the speaking frames, and extracting visual features. First, based on the chrominance and R/G discrepancy between skin and lip, we locate the region that carries the speech information exactly. Second, based on the energy difference between successive frames, we reliably segment the speaking frames from the frame sequence. Third, by applying an appropriate linear image transform to the region of interest, we extract informative visual speech features. Finally, using an HMM-based recognizer, we tested our algorithms on the HIIBi-CAV Database and in a real-time environment. Experimental results show that our algorithms are encouraging and effective. © 2008 by Binary Information Press.
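The three steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no formulas or thresholds, so the R/G ratio test, the mean-squared frame difference, the choice of a 2-D DCT as the "linear image transform", and every function name and parameter value below are assumptions made purely for illustration.

```python
import numpy as np


def lip_mask(rgb, ratio_threshold=1.5):
    """Mark pixels whose R/G ratio exceeds a threshold (hypothetical value).

    Assumption: lips are redder relative to green than the surrounding skin.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g = rgb[..., 0], rgb[..., 1]
    return r / (g + 1e-6) > ratio_threshold


def bounding_box(mask):
    """Return (top, bottom, left, right) around the True pixels, or None."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1


def speaking_frames(frames, energy_threshold):
    """Flag frame t as 'speaking' when the mean squared difference from
    frame t-1 exceeds energy_threshold. The first frame is never flagged."""
    frames = np.asarray(frames, dtype=np.float64)
    diff = np.diff(frames, axis=0)
    energy = (diff ** 2).mean(axis=tuple(range(1, frames.ndim)))
    return np.concatenate([[False], energy > energy_threshold])


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m


def dct_features(roi, keep=4):
    """Apply a 2-D DCT to a grayscale ROI and keep the keep x keep block of
    lowest-frequency coefficients as the feature vector.

    Assumption: the DCT stands in for the unspecified linear transform.
    """
    roi = np.asarray(roi, dtype=np.float64)
    h, w = roi.shape
    coeffs = dct_matrix(h) @ roi @ dct_matrix(w).T
    return coeffs[:keep, :keep].ravel()
```

In such a pipeline, the box returned by `bounding_box(lip_mask(frame))` would crop each frame, `speaking_frames` would select which cropped frames to keep, and the `dct_features` vectors of those frames would form the observation sequence passed to an HMM. Both thresholds would need tuning on real data.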