Abstract:For GMM-UBM based text-independent speaker recognition, the performance decreases significantly when the utterance is getting too short, and that is mostly due to the lack of distinguishable information from a single kind of feature. Fusion of different features followed by a dimensionality reduction process has been proved useful to provide a satisfying solution. However, some fusion methods based on the traditional Linear Discriminant Analysis (LDA) may cause the singular matrix problem. Therefore, a Fishervoice based feature fusion method incorporating with the Principal Component Analysis (PCA) and the LDA is proposed, where several features, such as MFCC, PLAR and LPCC, which are commonly used, are concatenated, and then projected into a lower-dimensional subspace. Compared with the baseline GMM-UBM systems using any single feature and using the LDA based fusion method, the proposed one can effectively reduce the equal error rate and give the best performance for text-independent speaker recognition for utterances as short as about 2 seconds.

A Fishervoice Based Feature Fusion Method for Short Utterance Speaker Recognition