Abstract: Speaker recognition (SRE), also called voiceprint recognition, is the problem of determining the identity of a speaker from a sample of the speech signal. It is an important branch of speech signal processing and has many potential applications, such as telephone banking, access control, information security, law enforcement, and other forensic applications (Bimbot et al., 2004; Campbell Jr., 1997; Cole et al., 1997; Kinnunen & Li, 2010; Reynolds, 2002). Compared with other biometric techniques, speaker recognition has several advantages: (1) Speech samples are convenient, natural, and inexpensive to acquire: no special device is needed, and a telephone, mobile phone, or ordinary microphone is adequate. (2) It can be used remotely: with ubiquitous telecommunication networks and the Internet, a speech sample can easily be transmitted by telephone or VoIP, which makes remote recognition possible. (3) The speech sample carries many inborn characteristics: from speech we can extract information about the vocal tract, mouth, tongue, soft palate, nasal cavity, and so on. (4) The speech sample also carries acquired characteristics, such as tone, volume, pace, rhythm, and rhetoric, which reflect the speaker's place of residence, education level, and personal habits. In speaker recognition, the Gaussian mixture model-universal background model (GMM-UBM) is a classical yet widely used method for text-independent speaker verification (Reynolds et al., 2000). In this method, the target speaker is modeled by a GMM and the impostors are modeled by a UBM. At test time, the likelihood of the speech sample is computed under the GMM and under the UBM, and a likelihood ratio hypothesis test is then used for speaker verification. Besides GMM-UBM, several other methods have been developed recently.
The most successful ones include the support vector machine using GMM supervectors (GSV-SVM) (Campbell et al., 2006), which concatenates the GMM mean vectors as the input for SVM training and testing, and joint factor analysis (JFA) (Kenny et al., 2007), which jointly models the channel subspace and the speaker subspace. Although these methods have progressed rapidly, GMM-UBM is still the basis for their development. Meanwhile, discriminative techniques such as minimum classification error (MCE), maximum mutual information (MMI), minimum phone error (MPE), and feature-domain MPE (fMPE) have achieved great success in speech recognition and language recognition (Burget et al., 2006; Juang & Katagiri, 1992; Povey & Kingsbury, 2007; Woodland & Povey, 2002).
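The GMM-UBM scoring described above can be illustrated with a minimal sketch. It assumes diagonal-covariance GMMs, hand-set toy parameters in place of trained models, an average per-frame log-likelihood ratio as the score, and a decision threshold of 0.0. The function names (`gmm_frame_loglik`, `llr_score`, `supervector`) and all numeric values are hypothetical and are not taken from the paper.

```python
import numpy as np


def gmm_frame_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D) feature vectors; weights: (M,);
    means and variances: (M, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                      # (T, M, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (M,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)  # (T, M)
    weighted = np.log(weights) + log_comp
    # Log-sum-exp over mixture components, stabilised by the per-frame peak.
    peak = weighted.max(axis=1, keepdims=True)
    return peak[:, 0] + np.log(np.exp(weighted - peak).sum(axis=1))


def llr_score(frames, speaker_gmm, ubm):
    """Average log-likelihood ratio: target-speaker GMM versus UBM."""
    ratio = gmm_frame_loglik(frames, *speaker_gmm) - gmm_frame_loglik(frames, *ubm)
    return float(np.mean(ratio))


def supervector(means):
    """Concatenate the GMM mean vectors into one vector (GSV-style input)."""
    return means.reshape(-1)


# Toy models (assumed values): two components, two feature dimensions.
weights = np.array([0.5, 0.5])
variances = np.ones((2, 2))
ubm = (weights, np.array([[0.0, 0.0], [4.0, 4.0]]), variances)
speaker_means = np.array([[1.0, 1.0], [5.0, 5.0]])
speaker_gmm = (weights, speaker_means, variances)

# Toy test utterances (assumed): one near the speaker model, one near the UBM.
target_frames = np.array([[1.0, 1.0], [5.0, 5.0]])
impostor_frames = np.array([[0.0, 0.0], [4.0, 4.0]])

THRESHOLD = 0.0  # assumed; in practice tuned on development data
target_score = llr_score(target_frames, speaker_gmm, ubm)
impostor_score = llr_score(impostor_frames, speaker_gmm, ubm)
accept_target = target_score > THRESHOLD
accept_impostor = impostor_score > THRESHOLD
speaker_supervector = supervector(speaker_means)
```

With these toy values, the utterance close to the speaker model receives a positive score and is accepted, while the utterance close to the UBM receives a negative score and is rejected. The `supervector` helper only shows the concatenation of mean vectors mentioned for GSV-SVM; the SVM training itself is not sketched here.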
