Abstract: One of the most difficult challenges in speaker recognition is dealing with channel variability. This paper introduces several new cross-channel compensation techniques for a Gaussian mixture model–universal background model (GMM-UBM) speaker verification system. The techniques include wideband noise reduction, echo cancellation, a simplified feature-domain latent factor analysis (LFA), and data-driven score normalization. A novel dynamic Gaussian selection algorithm is developed that reduces feature compensation time by more than 60% without any performance loss. The performance of the different techniques across varying train/test channel conditions is presented and discussed. The findings are that speech enhancement, which used to be neglected for telephone speech, is essential for cross-channel tasks, and that the channel compensation techniques developed for telephone-channel speech also perform effectively. A per-microphone performance analysis further shows that speech enhancement can greatly boost the effects of the other techniques, especially on channels with larger signal-to-noise ratio (SNR) variance. All results are presented on NIST SRE 2006 and 2008 data and show a promising performance gain over the baseline. The developed system is also compared with other state-of-the-art speaker verification systems. The results show that it obtains comparable or even better performance while consuming much less CPU time, making it more suitable for practical use.
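For readers unfamiliar with the GMM-UBM framework named in the abstract, the following is a minimal sketch of how a verification score is commonly formed: the average per-frame log-likelihood ratio between a speaker model and the UBM, optionally followed by score normalization. It is not the paper's implementation. The function names (`gmm_frame_loglik`, `llr_score`, `z_norm`), the diagonal-covariance assumption, and the choice of zero normalization (Z-norm) as the normalization step are illustrative assumptions; the abstract does not specify its data-driven normalization.

```python
import numpy as np

def gmm_frame_loglik(X, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM (assumed form).

    X: (T, D) feature frames; weights: (M,); means, variances: (M, D).
    """
    diff = X[:, None, :] - means[None, :, :]                      # (T, M, D)
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)  # (M,)
    log_comp = log_norm[None, :] - 0.5 * (diff ** 2 / variances[None, :, :]).sum(axis=2)
    log_comp = log_comp + np.log(weights)[None, :]                 # (T, M)
    # Numerically stable log-sum-exp over mixture components.
    peak = log_comp.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(log_comp - peak).sum(axis=1, keepdims=True))).ravel()

def llr_score(X, speaker_gmm, ubm):
    """Average log-likelihood ratio of the speaker model against the UBM."""
    return float(np.mean(gmm_frame_loglik(X, *speaker_gmm) - gmm_frame_loglik(X, *ubm)))

def z_norm(raw_score, impostor_scores):
    """Z-norm: standardize a raw score with impostor-score statistics
    (an assumed stand-in for the paper's data-driven normalization)."""
    impostor_scores = np.asarray(impostor_scores, dtype=float)
    return (raw_score - impostor_scores.mean()) / impostor_scores.std()

# Toy example: a one-component UBM at the origin and a speaker model shifted to 1.
ubm = (np.array([1.0]), np.zeros((1, 2)), np.ones((1, 2)))
speaker = (np.array([1.0]), np.ones((1, 2)), np.ones((1, 2)))
frames = np.ones((5, 2))
raw = llr_score(frames, speaker, ubm)
normalized = z_norm(raw, impostor_scores=[-1.0, 0.0, 1.0])
```

In this toy setup the frames sit exactly on the speaker mean, so the raw score is positive. In a real system the speaker model would typically be adapted from the UBM and the impostor scores would come from held-out non-target trials.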

Stereo Hidden Markov Modeling for Noise Robust Speech Recognition

Stereo-based Stochastic Mapping with Context Using Probabilistic PCA for Noise Robust Automatic Speech Recognition

Residual Noise Compensation For Robust Speech Recognition In Nonstationary Noise

Synthesized Stereo Mapping Via Deep Neural Networks for Noisy Speech Recognition

Synthesized Stereo-Based Stochastic Mapping with Data Selection for Robust Speech Recognition

Autoregressive Model-Based Robust Speech Recognition in Additive Noise Environment

Speech Recognition Algorithm Based on Neural Network and Hidden Markov Model

Noise Robust Speaker Recognition Based on Adaptive Frame Weighting in GMM for i-Vector Extraction

HMM-based pseudo-clean speech synthesis for splice algorithm

An Approach To Robust Speaker Recognition Using Stochastic Matching

Improvement of hidden Markov model (HMM) for speech recognition

Speech Recognition System Based on CDHMM/SOFMNN in Noisy Environment

VTS-based Robust Speech Recognition

Noisy speech recognition performance of discriminative HMMs

Robust Speaker Recognition in Cross-Channel Condition Based on Gaussian Mixture Model

A VTS-based Feature Compensation Approach to Noisy Speech Recognition Using Mixture Models of Distortion

The Hidden Markov Model of co-articulation and its application to the continuous speech recognition

Performance of Discriminative HMM Training in Noise

Hidden Markov Acoustic Modeling with Bootstrap and Restructuring for Low-Resourced Languages

Joint Training for Simultaneous Speech Denoising and Dereverberation with Deep Embedding Representations