A VTS-based Feature Compensation Approach to Noisy Speech Recognition Using Mixture Models of Distortion

Jun Du, Qiang Huo
DOI: https://doi.org/10.1109/icassp.2013.6639035
2013-01-01
ICASSP
Abstract: Recently, we proposed an approach to irrelevant variability normalization (IVN) based joint training of a reference Gaussian mixture model (GMM) for feature compensation and hidden Markov models (HMMs) for acoustic modeling by using a vector Taylor series (VTS) based feature compensation technique, where single-component densities are used to model additive noise and convolutional distortion, respectively. In this paper, mixtures of densities are used to enhance the distortion model. New formulations for maximum likelihood (ML) estimation of distortion model parameters and minimum mean squared error (MMSE) estimation of clean speech are derived and presented. A comparative study is conducted under three “training-testing” conditions on the Aurora3 database. Experimental results confirm that the proposed mixture models of distortion can achieve a significant performance gain compared with traditional distortion modeling.