Abstract: One of the most important challenges in speaker recognition is intersession variability (ISV), primarily cross-channel effects. Recent NIST speaker recognition evaluations (SRE) include a multilingual scenario in which training conversations from multilingual speakers are collected in a number of other languages, leading to a further decline in performance. One important reason is that a growing number of researchers use phonetic clustering to introduce high-level information into speaker recognition, and such language-dependent methods do not work well in multilingual conditions. In this paper, we study both language and channel mismatch using a support vector machine (SVM) speaker recognition system. Maximum likelihood linear regression (MLLR) transforms that adapt a universal background model (UBM) are adopted as features. We first introduce a novel language-independent statistical binary decision tree to reduce multi-language effects, and compare this data-driven approach with a traditional knowledge-based one. We also construct a framework for channel compensation that uses latent factor analysis (LFA) in the feature domain and MLLR supervector kernel-based nuisance attribute projection (NAP) in the model domain. Results on the NIST SRE 2006 1conv4w-1conv4w/mic corpus show significant improvement. We also compare our compensated MLLR-SVM system with state-of-the-art cepstral Gaussian mixture and SVM systems, and combine them for a further improvement.
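As a rough illustration of the model-domain step named above, the following is a minimal sketch of generic nuisance attribute projection, not the authors' implementation. It assumes each utterance is already represented by a fixed-length supervector and that a plain linear kernel is used; the function names `nap_projection` and `apply_nap`, the synthetic data, and the choice of rank are all hypothetical. The idea is to estimate the directions of largest within-speaker (session) variation and project them out before SVM training.

```python
import numpy as np


def nap_projection(supervectors, speaker_ids, rank):
    """Estimate an orthonormal basis U of the top within-speaker directions."""
    X = np.asarray(supervectors, dtype=float)
    ids = np.asarray(speaker_ids)
    # Subtract each speaker's mean so that only session variation remains.
    centered = np.empty_like(X)
    for spk in np.unique(ids):
        mask = ids == spk
        centered[mask] = X[mask] - X[mask].mean(axis=0)
    # Right singular vectors are the principal directions of that variation.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:rank].T  # shape: (dim, rank)


def apply_nap(vectors, U):
    """Apply P = I - U U^T to each row without forming P explicitly."""
    V = np.asarray(vectors, dtype=float)
    return V - (V @ U) @ U.T


# Synthetic data (an assumption for illustration): 5 speakers, 8 sessions each,
# with one strong shared "channel" direction added to every session.
rng = np.random.default_rng(0)
dim, n_spk, n_sess = 20, 5, 8
channel_dir = rng.normal(size=dim)
channel_dir /= np.linalg.norm(channel_dir)
speaker_means = rng.normal(size=(n_spk, dim))
ids = np.repeat(np.arange(n_spk), n_sess)
X = (
    speaker_means[ids]
    + 5.0 * rng.normal(size=(ids.size, 1)) * channel_dir
    + 0.1 * rng.normal(size=(ids.size, dim))
)

U = nap_projection(X, ids, rank=1)
X_comp = apply_nap(X, U)  # compensated supervectors, ready for SVM training
```

In this sketch the compensated vectors have no component along the estimated nuisance directions, so a linear-kernel SVM trained on `X_comp` cannot use them. The rank controls how much session variation is removed and would normally be tuned on held-out data.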
