Abstract: A dereverberation method based on generalized spectral subtraction (GSS) using multi-channel least mean-squares (MCLMS) was previously proposed, and speech recognition experiments showed that it achieved a significant improvement over conventional methods. In this paper, we apply this method to distant-talking (far-field) speaker recognition. For far-field speech, however, GSS-based dereverberation combined with clean speech models degrades speaker recognition performance. This may be because GSS-based dereverberation introduces some distortion, so that the dereverberated speech no longer matches clean speech. We address this problem by training speaker models on dereverberated speech obtained by suppressing reverberation in arbitrary artificial reverberant speech. Furthermore, we propose an efficient method for computing a combination of the likelihoods of dereverberated speech obtained with multiple compensation parameter sets, which addresses the problem of determining the optimal compensation parameters for GSS. We report the results of a speaker recognition experiment on large-scale far-field speech in reverberant environments that differ from the training environments. The proposed GSS-based dereverberation method achieves a recognition rate of 92.2%, compared with 49.0% and 88.4% for conventional cepstral mean normalization with delay-and-sum beamforming using a clean speech model and a reverberant speech model, respectively. We also compare the proposed method with another dereverberation technique, multi-step linear prediction-based spectral subtraction (MSLP-GSS), which achieves a recognition rate of 90.6%. Using multiple compensation parameter sets further improves recognition performance, raising the recognition rate of our approach to 93.6%. Finally, we apply the method in a real environment using the optimal compensation parameters estimated in an artificial environment. The results show a recognition rate of 87.8%, compared with 72.5% for delay-and-sum beamforming using a reverberant speech model.
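
The abstract does not give the paper's equations, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. The first function uses the generic textbook form of generalized spectral subtraction (subtraction in a power-exponent domain with a spectral floor), and the second combines per-speaker log-likelihoods across several compensation parameter sets by simple summation. The function names, the parameters `exponent`, `alpha`, and `floor`, and the summation rule are all hypothetical illustrations, not the authors' formulation. The MCLMS estimation of the reverberation spectrum is not shown; `reverb_mag` is assumed to be given.

```python
import math


def generalized_spectral_subtraction(observed_mag, reverb_mag,
                                     exponent=0.5, alpha=1.0, floor=0.1):
    """Subtract an estimated reverberation spectrum in the |.|^(2*exponent) domain.

    observed_mag, reverb_mag: magnitude spectra (lists of non-negative floats).
    exponent, alpha, floor: hypothetical compensation parameters
    (exponent=1.0 gives ordinary power spectral subtraction).
    """
    p = 2.0 * exponent
    cleaned = []
    for x, r in zip(observed_mag, reverb_mag):
        diff = x ** p - alpha * r ** p
        # The spectral floor keeps the estimate non-negative.
        floored = max(diff, floor * x ** p)
        cleaned.append(floored ** (1.0 / p))
    return cleaned


def combine_log_likelihoods(scores_per_param_set):
    """Sum each speaker's log-likelihood over all parameter sets.

    scores_per_param_set: list of dicts {speaker: log_likelihood}, one dict
    per compensation parameter set. Summation is an assumed combination rule.
    Returns (best_speaker, totals).
    """
    totals = {}
    for scores in scores_per_param_set:
        for speaker, log_lik in scores.items():
            totals[speaker] = totals.get(speaker, 0.0) + log_lik
    best_speaker = max(totals, key=totals.get)
    return best_speaker, totals


if __name__ == "__main__":
    spectrum = generalized_spectral_subtraction([1.0, 2.0], [0.5, 0.0], exponent=1.0)
    print(spectrum)
    print(combine_log_likelihoods([{"a": -10.0, "b": -12.0},
                                   {"a": -11.0, "b": -8.0}]))
```

In this sketch, each compensation parameter set would yield its own dereverberated features and its own set of speaker scores; combining the scores avoids having to commit to a single parameter set in advance.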
