Study on the Identification of the Isolated Word in Mandarin Speech Recognition with Tone Information

王鹏,胡郁,戴礼荣,刘庆峰
DOI: https://doi.org/10.3969/j.issn.1003-0077.2010.04.012
2010-01-01
Abstract: Mandarin is a tonal language, and tone information plays a key role in Mandarin speech recognition. Within the HMM (Hidden Markov Model) framework, how to use tone information effectively remains an important open research issue. State-of-the-art Mandarin speech recognition systems apply tone information in two ways. The first is the Embedded Tone Model, in which tone-related features are appended to spectral features to form augmented acoustic feature vectors for training the HMMs. The second is the Explicit Tone Model, in which tone modeling is separated from syllable modeling and the tone model is applied to optimize an existing decoding network. This paper presents a way to combine the two methods for isolated-word recognition in Mandarin. First, N-best candidates are obtained with an Embedded Tone Model based on a two-stream model rather than the conventional single-stream model. Then an Explicit Tone Model based on a left-dependent tonal model is built to re-score the N-best candidates. The proposed method achieves over 5.0% absolute improvement on average across all test sets, and up to 5.36% absolute improvement on the NoiseCar test set, compared with a traditional model without tone information.
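The two-pass idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the score-combination rule, so the weighted sum of log scores, the names `Hypothesis`, `two_stream_loglik` and `rescore_nbest`, and all weights and numbers below are assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    word: str             # candidate word (pinyin with tone digit, illustrative)
    first_pass_logp: float  # log score from the embedded (two-stream) model
    tone_logp: float        # log score from a separate explicit tone model


def two_stream_loglik(spectral_logp, tone_feat_logp,
                      spectral_weight=1.0, tone_weight=0.5):
    """Combine per-stream log-likelihoods with stream weights.

    Assumption: a weighted sum of stream log-likelihoods, as is common in
    multi-stream HMMs. The weights here are hypothetical.
    """
    return spectral_weight * spectral_logp + tone_weight * tone_feat_logp


def rescore_nbest(nbest, tone_model_weight=1.0):
    """Re-rank N-best candidates by first-pass score plus weighted tone score.

    Assumption: log-linear combination; the paper may use a different rule.
    """
    return sorted(
        nbest,
        key=lambda h: h.first_pass_logp + tone_model_weight * h.tone_logp,
        reverse=True,
    )


# Two candidates that differ only in tone; all scores are made up.
nbest = [
    Hypothesis("mai3", two_stream_loglik(-96.0, -8.0), tone_logp=-8.0),
    Hypothesis("mai4", two_stream_loglik(-98.0, -6.0), tone_logp=-2.0),
]
best = rescore_nbest(nbest)[0]
print(best.word)
```

With these made-up scores the first pass slightly prefers `mai3`, while the explicit tone score moves `mai4` to the top after re-scoring. Setting `tone_model_weight` to 0 recovers the first-pass ranking.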