Abstract: One of the most difficult challenges in speaker recognition is dealing with channel variability. In this paper, several new cross-channel compensation techniques are introduced for a Gaussian mixture model-universal background model (GMM-UBM) speaker verification system. These techniques include wideband noise reduction, echo cancellation, a simplified feature-domain latent factor analysis (LFA), and data-driven score normalization. A novel dynamic Gaussian selection algorithm is developed that reduces the feature compensation time by more than 60% without any performance loss. The performance of the different techniques across varying train/test channel conditions is presented and discussed. The results show that speech enhancement, which has typically been neglected for telephone speech, is essential for cross-channel tasks, and that the channel compensation techniques developed for telephone speech also perform effectively. The per-microphone performance analysis further shows that speech enhancement can greatly boost the effects of the other techniques, especially on channels with larger signal-to-noise ratio (SNR) variance. All results are reported on NIST SRE 2006 and 2008 data and show a promising performance gain over the baseline. The developed system is also compared with other state-of-the-art speaker verification systems. The comparison shows that it obtains comparable or even better performance while consuming much less CPU time, making it more suitable for practical use.
