Mouth Shape Detection Based on Template Matching and Optical Flow for Machine Lip Reading

Tsuyoshi Miyazaki, Toyoshiro Nakashima, Naohiro Ishii
DOI: https://doi.org/10.4018/ijsi.2013010102
2013-01-01
International Journal of Software Innovation
Abstract: The authors describe an improved method for detecting distinctive mouth shapes in image sequences of Japanese utterances. Two types of mouth shapes are formed when a Japanese phone is pronounced: one at the beginning of the utterance (the beginning mouth shape, BeMS) and one at the end (the ending mouth shape, EMS). The authors’ previous method, based on template matching, could detect mouth shapes but misdetected some of them because the BeMS is formed only for a short period. They therefore predicted that a high-speed camera would capture the BeMS with higher accuracy. Experiments showed that the BeMS could be captured, but they exposed another problem: deformed mouth shapes that appear during the transition from one shape to another were detected as the BeMS. This study uses optical flow to prevent the detection of such shapes: the time period in which the mouth shape is deforming is detected using optical flow, and mouth shapes during that period are ignored. The resulting method detects the BeMS and EMS in Japanese utterance image sequences by combining template matching and optical flow.
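The idea of ignoring frames in which the mouth is still deforming can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no thresholds or data layout, so the function name `detect_mouth_shapes`, the per-frame score dictionaries, the scalar flow magnitudes, and both threshold values are assumptions made for illustration only. The sketch assumes that template-matching scores and a mean optical-flow magnitude have already been computed for every frame.

```python
from typing import Dict, List, Optional


def detect_mouth_shapes(
    match_scores: List[Dict[str, float]],
    flow_magnitudes: List[float],
    flow_threshold: float = 1.0,   # assumed value, not taken from the paper
    score_threshold: float = 0.8,  # assumed value, not taken from the paper
) -> List[Optional[str]]:
    """Label each frame with its best-matching template, or None if ignored."""
    if len(match_scores) != len(flow_magnitudes):
        raise ValueError("one flow magnitude is required per frame")

    labels: List[Optional[str]] = []
    for scores, flow in zip(match_scores, flow_magnitudes):
        # Large motion is taken to mean the mouth is in transition between
        # shapes, so the frame is skipped regardless of its matching scores.
        if flow > flow_threshold or not scores:
            labels.append(None)
            continue
        # Otherwise keep the best-scoring template if it matches well enough.
        best = max(scores, key=lambda name: scores[name])
        labels.append(best if scores[best] >= score_threshold else None)
    return labels


# Hypothetical three-frame sequence: the middle frame is a transition.
scores = [{"a": 0.95, "i": 0.40}, {"a": 0.85, "i": 0.82}, {"a": 0.30, "i": 0.93}]
flows = [0.2, 3.5, 0.1]
labels = detect_mouth_shapes(scores, flows)  # ["a", None, "i"]
```

In this toy input the middle frame would match template "a" on score alone, but its high flow magnitude marks it as a transition frame, so it is left unlabeled.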