An Improved Feature Fusion for Speaker Recognition
Meixiang Dai, Guojun Dai, Yifan Wu, Yixing Xia, Fangyao Shen, Hua Zhang
DOI: https://doi.org/10.1109/DSC.2019.00035
2019-01-01
Abstract: Speech is the most effective means of human communication, and its uniqueness to each individual is the basis for speaker recognition. Finding a speaker's distinctive speech features to improve system performance is a research focus. In this paper, we propose a novel feature fusion approach for speaker recognition. It is based on the ratio of inter-class to intra-class variance, calculated for each dimension of traditional features such as Linear Prediction Cepstral Coefficients (LPCC) and Perceptual Linear Prediction (PLP). The higher the ratio, the more significant the component; the more significant components are combined into a new feature. Pitch period and spectral centroid features are added to capture additional speech characteristics. Experimental results show that, with a GMM classifier, the proposed feature fusion improves speaker recognition accuracy by 13.26% and 8.74% over traditional LPCC and PLP features, respectively.
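The variance-ratio selection described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula or the number of components kept, so the following assumes a common definition (variance of the per-speaker means divided by the mean per-speaker variance) and a hypothetical cutoff `k`. The function names `variance_ratio`, `top_dims`, and `fuse` are illustrative and do not come from the paper. Only the Python standard library is used to keep the example self-contained.

```python
from statistics import fmean, pvariance


def variance_ratio(frames, speakers):
    """Per-dimension ratio of inter-class to intra-class variance.

    frames:   list of equal-length feature vectors, one per frame.
    speakers: list of speaker labels, one per frame.
    Assumed definition; the paper's exact formula may differ.
    """
    by_speaker = {}
    for frame, speaker in zip(frames, speakers):
        by_speaker.setdefault(speaker, []).append(frame)

    ratios = []
    for d in range(len(frames[0])):
        # Values of dimension d, grouped by speaker.
        columns = [[f[d] for f in group] for group in by_speaker.values()]
        # Inter-class variance: spread of the per-speaker means.
        inter = pvariance([fmean(col) for col in columns])
        # Intra-class variance: average spread within each speaker.
        intra = fmean([pvariance(col) for col in columns])
        # Small epsilon (an assumption) avoids division by zero.
        ratios.append(inter / (intra + 1e-12))
    return ratios


def top_dims(ratios, k):
    """Indices of the k highest-ratio dimensions, in ascending order."""
    ranked = sorted(range(len(ratios)), key=lambda d: ratios[d], reverse=True)
    return sorted(ranked[:k])


def fuse(lpcc, plp, speakers, k):
    """Concatenate the k most discriminative dimensions of each feature set."""
    keep_lpcc = top_dims(variance_ratio(lpcc, speakers), k)
    keep_plp = top_dims(variance_ratio(plp, speakers), k)
    return [
        [a[d] for d in keep_lpcc] + [b[d] for d in keep_plp]
        for a, b in zip(lpcc, plp)
    ]
```

In a full pipeline, the pitch period and spectral centroid of each frame would presumably be appended to the fused vector before it is passed to the GMM classifier; those steps are omitted here because the abstract does not specify how they are computed.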