Abstract: A dereverberation method based on generalized spectral subtraction (GSS) using multi-channel least mean squares (MCLMS) was proposed previously, and speech recognition experiments showed that it achieved a significant improvement over conventional methods. In this paper, we apply this method to distant-talking (far-field) speaker recognition. For far-field speech, however, GSS-based dereverberation combined with clean speech models degrades speaker recognition performance, possibly because the dereverberation introduces distortion between clean speech and dereverberant speech. We address this problem by training speaker models on dereverberant speech obtained by suppressing reverberation in arbitrary artificial reverberant speech. Furthermore, because the optimal compensation parameters for GSS are difficult to determine, we propose an efficient method for computing a combination of the likelihoods of dereverberant speech obtained with multiple compensation parameter sets. We report the results of a speaker recognition experiment on large-scale far-field speech in reverberant environments that differ from the training environments. The proposed GSS-based dereverberation method achieves a recognition rate of 92.2%, compared with 49.0% and 88.4% for conventional cepstral mean normalization with delay-and-sum beamforming using a clean speech model and a reverberant speech model, respectively. It also outperforms another dereverberation technique, multi-step linear prediction-based spectral subtraction (MSLP-GSS), which achieves 90.6%. Using multiple compensation parameter sets further improves speaker recognition performance, raising the recognition rate to 93.6%. Finally, we apply the method in a real environment using the optimal compensation parameters estimated from an artificial environment. The results show a recognition rate of 87.8%, compared with 72.5% for delay-and-sum beamforming using a reverberant speech model.
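
As a rough illustration of the spectral-domain operation named in the abstract, the following is a minimal sketch of generalized spectral subtraction in one common formulation, not the authors' implementation. It assumes that an estimate of the late-reverberation magnitude spectrum is already available (the MCLMS-based estimation is not shown). The function name, the parameter names `n`, `alpha`, and `beta`, and their default values are assumptions chosen for illustration.

```python
import numpy as np


def generalized_spectral_subtraction(observed_mag, late_reverb_mag,
                                     n=0.5, alpha=1.0, beta=0.1):
    """Subtract a late-reverberation estimate in the |.|^(2n) domain.

    observed_mag, late_reverb_mag: non-negative magnitude spectra, same shape.
    n:     exponent parameter (n=1 is power-domain, n=0.5 is magnitude-domain).
    alpha: subtraction weight (hypothetical default).
    beta:  spectral floor as a fraction of the observed spectrum (hypothetical default).
    """
    observed = np.asarray(observed_mag, dtype=float) ** (2 * n)
    reverb = np.asarray(late_reverb_mag, dtype=float) ** (2 * n)

    # Subtract the weighted reverberation estimate, then floor the result so
    # that no bin falls below beta times the observed spectrum.
    subtracted = np.maximum(observed - alpha * reverb, beta * observed)

    # Map back from the |.|^(2n) domain to a magnitude spectrum.
    return subtracted ** (1.0 / (2 * n))
```

In this sketch, `alpha` and `beta` stand in for one possible compensation parameter set. Calling the function with several such sets would yield several dereverberated spectra, whose likelihoods under a speaker model could then be combined, which is the setting the abstract describes.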
