Improved covariance modeling for maximum likelihood multiple subspace transformations [speech recognition applications]
Xi Zhou, Ye Tian, Jianlai Zhou, Beiqian Dai
DOI: https://doi.org/10.1109/icassp.2005.1415211
2005-01-01
Abstract: Maximum likelihood (ML) multiple subspace transformation algorithms, such as semi-tied covariance (STC) and multiple heteroscedastic linear discriminant analysis (HLDA), have achieved significant improvements in speech recognition. In STC and multiple HLDA, the Gaussian components are partitioned into multiple component sets. Within each set, the full covariance matrices of the Gaussian components, estimated under the ML criterion, are used to estimate that set's linear transformation. However, a full covariance matrix contains a large number of free parameters and may not be reliably estimated by the ML criterion. An unreliable full covariance leads to an unreliable linear transformation and, ultimately, to poor recognition results. Several algorithms have been proposed to estimate the full covariance reliably, such as mixture of inverse covariance (MIC), SPAM, and hierarchical correlation compensation (HCC). In this paper, we combine HCC with STC and multiple HLDA. Experiments show that standard STC achieves a 12.47% word error rate (WER) reduction on the RM database, while our HCC+STC achieves a 19.32% WER reduction.
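
To make the parameter-count argument concrete, the following minimal sketch compares the number of free parameters in a full and a diagonal covariance matrix, and gives the maximum rank an ML full covariance estimate can reach from a limited number of frames. The function names and the feature dimension of 39 are illustrative assumptions, not values taken from the paper.

```python
# Illustrative sketch only: names and the dimension 39 are assumptions,
# not taken from the paper.

def full_covariance_params(dim):
    """Free parameters of a symmetric dim x dim covariance matrix."""
    return dim * (dim + 1) // 2

def diagonal_covariance_params(dim):
    """Free parameters of a diagonal covariance matrix."""
    return dim

def max_rank_of_ml_covariance(num_frames, dim):
    """Upper bound on the rank of an ML covariance estimate.

    Subtracting the sample mean removes one degree of freedom, so the
    estimate built from num_frames vectors has rank at most num_frames - 1.
    """
    return min(dim, max(num_frames - 1, 0))

dim = 39  # assumed feature dimension, for illustration
print(full_covariance_params(dim))         # 780
print(diagonal_covariance_params(dim))     # 39
print(max_rank_of_ml_covariance(20, dim))  # 19
```

Under these assumptions, a Gaussian component observed in only 20 frames yields a singular full covariance estimate, which illustrates why the per-component full covariances fed into the transformation estimate can be unreliable.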