Hands-free speaker identification based on spectral subtraction using a multi-channel least mean square approach

Longbiao Wang, Zhaofeng Zhang, Atsuhiko Kai
DOI: https://doi.org/10.1109/ICASSP.2013.6639065
2013-01-01
ICASSP
Abstract: A dereverberation method based on generalized spectral subtraction (GSS) using a multi-channel least mean square (MCLMS) approach has achieved significantly better speech recognition results than conventional methods. In this study, we apply this method to hands-free speaker identification. Although GSS-based dereverberation is very effective for speech recognition, it degrades speaker identification performance when used with clean speech models. One reason may be that GSS-based dereverberation introduces distortion, so that the characteristics of dereverberated speech differ from those of clean speech. We address this problem by training speaker models on dereverberated speech, which is obtained by suppressing reverberation in arbitrary artificial reverberant speech. We also propose a method that combines various compensation parameter sets to improve speaker identification, and we provide an efficient way to compute it. The speaker identification experiment was performed on large-scale far-field speech, in reverberant environments different from the training environments. The proposed method achieved a relative error reduction of 87.5% compared with conventional cepstral mean normalization with beamforming using clean speech models, and 44.8% compared with reverberant speech models.
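The abstract reports its gains as relative error reductions. As a minimal sketch of how such a figure is usually computed, the Python snippet below uses the standard definition (baseline error minus new error, divided by baseline error). The function name `relative_error_reduction` and the error rates in the usage lines are hypothetical illustrations, not values taken from the paper.

```python
def relative_error_reduction(baseline_error: float, proposed_error: float) -> float:
    """Return the relative error reduction in percent.

    Both arguments are error rates in the same unit (e.g. percent of
    misidentified utterances). The baseline must be strictly positive.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - proposed_error) / baseline_error


# Hypothetical error rates, chosen only to illustrate the arithmetic.
baseline = 8.0   # assumed baseline identification error rate, in percent
proposed = 1.0   # assumed error rate of an improved system, in percent
print(relative_error_reduction(baseline, proposed))
```

With these assumed inputs, an absolute drop of 7 percentage points on a baseline of 8% corresponds to a relative reduction of 87.5%. This shows why a relative figure can look large even when absolute error rates are small.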