Phone modeling and combining discriminative training for Mandarin-English bilingual speech recognition

Yanmin Qian, Jia Liu
DOI: https://doi.org/10.1109/ICASSP.2010.5495112
2010-01-01
Abstract: Automatic multilingual speech recognition remains a difficult task. This paper presents recent work on the development of a Mandarin-English bilingual speech recognition system. A single unified set of bilingual acoustic models, based on a novel State-Time-Alignment (STA) method, is proposed to balance the performance and complexity of the bilingual speech recognition system, and a comparison with the acoustic-likelihood method is presented. Discriminative training approaches such as minimum phone error (MPE) and feature-space MPE (fMPE) have been shown to improve monolingual recognition performance, but have not yet been applied to bilingual speech recognition. This paper investigates the use of both methods for bilingual speech recognition. Experimental results show that the STA phone clustering method outperforms other existing phone clustering methods, and that both forms of discriminative training reduce the word error rate of the multilingual system. ©2010 IEEE.