A Fishervoice Based Feature Fusion Method for Short Utterance Speaker Recognition

Chenhao Zhang,Thomas Fang Zheng
DOI: https://doi.org/10.1109/chinasip.2013.6625320
2013-01-01
Abstract:For GMM-UBM based text-independent speaker recognition, the performance decreases significantly when the utterance is getting too short, and that is mostly due to the lack of distinguishable information from a single kind of feature. Fusion of different features followed by a dimensionality reduction process has been proved useful to provide a satisfying solution. However, some fusion methods based on the traditional Linear Discriminant Analysis (LDA) may cause the singular matrix problem. Therefore, a Fishervoice based feature fusion method incorporating with the Principal Component Analysis (PCA) and the LDA is proposed, where several features, such as MFCC, PLAR and LPCC, which are commonly used, are concatenated, and then projected into a lower-dimensional subspace. Compared with the baseline GMM-UBM systems using any single feature and using the LDA based fusion method, the proposed one can effectively reduce the equal error rate and give the best performance for text-independent speaker recognition for utterances as short as about 2 seconds.
What problem does this paper attempt to address?