Study of Chinese Viseme

王志明,蔡莲红
DOI: https://doi.org/10.3969/j.issn.1000-310x.2002.03.007
IF: 3.614
2002-01-01
Applied Acoustics
Abstract: MPEG-4 defines a viseme as the physical (visual) configuration of the mouth, tongue, and jaw that is visually correlated with the speech sound corresponding to a phoneme. Based on a study of the movement of the visual articulators in uttering Chinese speech and of the pronunciation rules, we define 28 basic static visemes of Chinese. We describe these visemes in terms of 28 of the 68 MPEG-4 facial animation parameters (FAPs), extract them automatically from AVI files based on speech information, and measure some of the FAP values by automatically tracking the mouth contour and a set of marked points. Finally, we give an example of the use of these visemes.