GMM-HMM Acoustic Model Training by a Two Level Procedure with Gaussian Components Determined by Automatic Model Selection

Dan Su, Xihong Wu, Lei Xu
DOI: https://doi.org/10.1109/icassp.2010.5495122
2010-01-01
Abstract: This paper investigates Bayesian Ying-Yang (BYY) learning for speech recognition with hidden Markov models (HMMs) based on Gaussian mixture models (GMMs). A two-level procedure is proposed in which the hidden Markov level is still trained under the maximum likelihood principle by the Baum-Welch algorithm, while the GMM level is trained under the BYY best harmony. We propose a new batch, EM-like Ying-Yang alternation algorithm and use it as a plug-in block within the Baum-Welch algorithm. The advantage is that the number of GMM components can be determined automatically during BYY harmony learning, and that the resulting model parameters are less affected than those from EM-ML training by overfitting and singular solutions. In comparison with standard EM-ML training and classical model selection criteria, including BIC and AIC, speech recognition experiments on a large-vocabulary task using the Hub4 broadcast news database show that the proposed algorithm provides improved performance and good convergence.
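For context on the classical baselines named in the abstract, the following is a minimal sketch of how BIC and AIC are typically used to choose the number of GMM components. It is not the paper's BYY algorithm. The function names, the diagonal-covariance assumption, and the log-likelihood values are all hypothetical and chosen only for illustration.

```python
import math


def diag_gmm_num_params(num_components: int, dim: int) -> int:
    """Free parameters of a GMM with diagonal covariances (an assumption here)."""
    # Each component has `dim` means and `dim` variances; the mixture
    # weights sum to one, so only num_components - 1 of them are free.
    return num_components * 2 * dim + (num_components - 1)


def aic(log_likelihood: float, num_params: int) -> float:
    """Akaike information criterion: lower is better."""
    return -2.0 * log_likelihood + 2.0 * num_params


def bic(log_likelihood: float, num_params: int, num_samples: int) -> float:
    """Bayesian information criterion: lower is better."""
    return -2.0 * log_likelihood + num_params * math.log(num_samples)


def select_num_components(log_likelihoods, dim, num_samples, criterion="bic"):
    """Pick the component count whose trained model minimises the criterion.

    `log_likelihoods` maps a candidate component count to the total training
    log-likelihood of a GMM already fitted with that many components.
    """
    scores = {}
    for k, ll in log_likelihoods.items():
        p = diag_gmm_num_params(k, dim)
        scores[k] = bic(ll, p, num_samples) if criterion == "bic" else aic(ll, p)
    return min(scores, key=scores.get)


# Made-up log-likelihoods for 1, 2 and 4 components on 1000 two-dimensional frames.
example_lls = {1: -5000.0, 2: -4700.0, 4: -4685.0}
print(select_num_components(example_lls, dim=2, num_samples=1000, criterion="bic"))  # 2
print(select_num_components(example_lls, dim=2, num_samples=1000, criterion="aic"))  # 4
```

Note that this two-stage approach requires training one model per candidate size and then comparing scores, whereas the abstract describes determining the number of components automatically during training. On the made-up numbers above, BIC's heavier complexity penalty selects fewer components than AIC.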