Abstract: In this paper we investigate GMM-derived (GMMD) features for adaptation of deep neural network (DNN) acoustic models. A DNN trained on GMMD features is adapted through maximum a posteriori (MAP) adaptation of the auxiliary GMM used for GMMD feature extraction. We explore fusion of the adapted GMMD features with conventional features, such as bottleneck and MFCC features, in two neural network architectures: DNN and time-delay neural network (TDNN). We compare the proposed approach with other adaptation techniques, namely i-vectors and feature-space maximum likelihood linear regression (fMLLR), and explore their complementarity using several types of fusion (feature level, posterior level, lattice level and others) to find the best way of combining them. Experimental results on the TED-LIUM corpus show that the proposed adaptation technique can be effectively integrated into DNN and TDNN setups at different levels and provides additional gains in recognition performance: up to 6% relative word error rate reduction (WERR) over a strong fMLLR speaker-adapted DNN baseline, and up to 18% relative WERR over a speaker-independent (SI) DNN baseline trained on conventional features. For TDNN models the proposed approach achieves up to 26% relative WERR over an SI baseline, and up to 13% over a model adapted with i-vectors. An analysis of the adapted GMMD features from various points of view demonstrates their effectiveness at different levels.
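
The abstract names MAP adaptation of an auxiliary GMM but gives no formulas, so the following is only a minimal sketch of one common formulation: mean-only MAP adaptation of a diagonal-covariance GMM with a relevance factor. The function name `map_adapt_means`, the parameter `tau`, and the choice to adapt only the means are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def map_adapt_means(weights, means, variances, frames, tau=10.0):
    """Mean-only MAP adaptation of a diagonal-covariance GMM (illustrative).

    weights:   (M,)   prior component weights
    means:     (M, D) prior (speaker-independent) means
    variances: (M, D) diagonal covariances
    frames:    (T, D) adaptation data from one speaker
    tau:       relevance factor (hypothetical default, not from the paper)
    """
    # Log-density of every frame under every component, shape (T, M).
    diff = frames[:, None, :] - means[None, :, :]
    log_dens = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )
    log_post = np.log(weights) + log_dens

    # Normalise to component posteriors in a numerically stable way.
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    # Zeroth- and first-order sufficient statistics.
    occupancy = post.sum(axis=0)   # (M,)
    first_order = post.T @ frames  # (M, D)

    # Interpolate between the prior mean and the data, weighted by occupancy.
    return (tau * means + first_order) / (tau + occupancy)[:, None]
```

With this update, components that see little adaptation data stay close to their prior means, while well-observed components move toward the speaker's data; a larger `tau` makes the adaptation more conservative.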

Speaker Recognition Based on SOINN and Incremental Learning Gaussian Mixture Model

Agmma: A Novel Incremental Adaptation Method And Its Application To Speaker Recognition

Universal background model reduction based efficient speaker recognition

Genetic Algorithms and Fuzzy Approach to Gaussian Mixture Model for Speaker Recognition

Speaker Identification System Based on Hybrid Neural Network

Speaker Identification Based on Classify Feature Sub-space Gaussian Mixture Model and Neural Net Fusion

Using MCE Algorithm to Improve the Performance of Speaker Recognition

Text-independent Speaker Recognition Based on Self-adaptation Compensation Transformation

A genetic classification method for speaker recognition

Mixture of Support Vector Machines for Text-Independent Speaker Recognition

GMM and CNN Hybrid Method for Short Utterance Speaker Recognition

An New Approach for Incremental Speaker Adaptation

Speaker Verification Using Adapted Gaussian Mixture Models

A Discriminative Training Approach for Text-Independent Speaker Recognition

Speaker recognition with short utterances based on multiple kernel SVM-GMM

Exploring Gaussian mixture model framework for speaker adaptation of deep neural network acoustic models

Speaker recognition using continuous density support vector machines

Study to Speaker Recognition Using RVM

Improving Online Incremental Speaker Adaptation with Eigen Feature Space MLLR

Application of differential evolution optimization based Gaussian Mixture Models to speaker recognition

A Real Time Speaker Recognition System Based on GMM