Discriminative HMM Stream Model for Mandarin Digit String Speech Recognition

YY Shi, J Liu, RS Liu
DOI: https://doi.org/10.1109/ICOSP.2002.1181109
2002-01-01
Abstract: A conventional hidden Markov model (HMM) based only on spectral features does not achieve high recognition performance on connected Mandarin digits, because highly confusable syllables exist. The main problems of Mandarin digit recognition are analyzed. The analysis reveals that precise classification models for Mandarin digits not only require features extracted from the spectrum, energy, and pitch contour, but also require those features to be emphasized differently for different digits. Each type of feature is therefore used to train a single-stream HMM by maximum likelihood. A multi-stream HMM is then obtained by combining the single-stream HMMs with exponents that weight the log-likelihood of each stream. The exponents are estimated with the generalized probabilistic descent algorithm under a digit-level minimum classification error rate criterion. The superiority of the multi-stream HMM is demonstrated: the string error rate is reduced by 54.5% in relative terms. For digit strings of unknown length, the string error rate and the digit error rate decrease to 4.66% and 1.31%, respectively.
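The stream combination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each digit model already yields one log-likelihood per stream, that the combined score is the exponent-weighted sum of those log-likelihoods, and that the exponents are digit-dependent. The function names (`combined_score`, `classify`, `gpd_step`), the digit labels, and all numeric values are hypothetical. The update step is a simplified minimum-classification-error update that uses only the best competing digit and a sigmoid loss; a practical version would typically also constrain the exponents, which is omitted here.

```python
import copy
import math

STREAMS = ("spectrum", "energy", "pitch")

def combined_score(stream_loglik, exponents):
    """Exponent-weighted sum of the per-stream log-likelihoods of one digit model."""
    return sum(exponents[s] * stream_loglik[s] for s in STREAMS)

def classify(loglik, exponents):
    """Return the digit whose multi-stream score is highest.

    loglik[digit][stream] and exponents[digit][stream] are plain dicts.
    """
    return max(loglik, key=lambda d: combined_score(loglik[d], exponents[d]))

def gpd_step(loglik, exponents, correct, step=0.01, slope=1.0):
    """One simplified GPD update of the exponents (assumption: best competitor only).

    Modifies `exponents` in place and returns the sigmoid loss before the update.
    """
    rival = max((d for d in loglik if d != correct),
                key=lambda d: combined_score(loglik[d], exponents[d]))
    # Misclassification measure: positive when the rival outscores the correct digit.
    d_mis = (combined_score(loglik[rival], exponents[rival])
             - combined_score(loglik[correct], exponents[correct]))
    loss = 1.0 / (1.0 + math.exp(-slope * d_mis))
    scale = step * slope * loss * (1.0 - loss)
    for s in STREAMS:
        # Gradient descent on the loss with respect to each exponent.
        exponents[correct][s] += scale * loglik[correct][s]
        exponents[rival][s] -= scale * loglik[rival][s]
    return loss

# Hypothetical per-stream log-likelihoods for one utterance whose true digit is "1".
loglik = {
    "1": {"spectrum": -10.0, "energy": -4.0, "pitch": -3.0},
    "7": {"spectrum": -9.0, "energy": -6.0, "pitch": -8.0},
}
spectral_only = {d: {"spectrum": 1.0, "energy": 0.0, "pitch": 0.0} for d in loglik}
start = {d: {"spectrum": 1.0, "energy": 0.1, "pitch": 0.1} for d in loglik}

print(classify(loglik, spectral_only))   # spectral stream alone
trained = copy.deepcopy(start)
print(classify(loglik, trained))         # before the update
gpd_step(loglik, trained, correct="1")
print(classify(loglik, trained))         # after one update
```

In this toy setting the spectral stream alone favors the wrong digit, and a single update shifts the exponents so that the energy and pitch streams tip the decision toward the correct one. The numbers are chosen only to make that effect visible and say nothing about the paper's actual results.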