Modeling Pronunciation Variation Using Context-Dependent Weighting and B/S Refined Acoustic Modeling

Thomas Fang Zheng, Zhanjiang Song, Pascale Fung, William Byrne
DOI: https://doi.org/10.21437/eurospeech.2001-13
2001-01-01
Abstract: Pronunciation variability is an important issue that must be addressed when developing practical automatic spontaneous speech recognition systems. By studying the initial/final (IF) characteristics of the Chinese language and developing the Bayesian equation, we propose the concepts of the generalized initial/final (GIF) and the generalized syllable (GS), the GIF modeling method and the IF-GIF modeling method, and a context-dependent pronunciation weighting method. With these approaches, IF-GIF modeling reduces the Chinese syllable error rate (SER) by 6.3% and 4.2% compared with GIF modeling and IF modeling, respectively, when no language model, such as a syllable or word N-gram, is used.