Speaker Verification Based on Prosodic Features

Yanhua Long, Wu Guo, Lirong Dai
DOI: https://doi.org/10.3969/j.issn.1004-9037.2010.01.015
2010-01-01
Abstract: In text-independent speaker recognition systems, prosodic features are widely used to verify speaker identity because they are less sensitive to channel and noise effects than cepstral features. This paper proposes a verification method called the prosody Gaussian covariance projection-support vector machine (PGCP-SVM). The method is based on pitch and energy together with their dynamic features. Unlike conventional techniques, target speakers are modeled by a support vector machine (SVM) built on Gaussian mixture model (GMM) mean-supervectors. In particular, a within-class covariance projection is applied to the mean-supervectors, which improves the performance of the prosodic system. Combining the prosodic system with an acoustic Mel-frequency cepstral coefficient (MFCC) system improves performance by 9.25%. On the NIST 2006 SRE corpus, the equal error rate (EER) of the combined system reaches 4.9%.
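The abstract does not give the exact form of the within-class covariance projection, so the following is only a minimal sketch of one common variant, within-class covariance normalization (WCCN), applied to supervectors. It assumes each row of `X` is a GMM mean-supervector and that several sessions per speaker are available. The function names, the `ridge` regularizer, and the synthetic data are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def within_class_covariance(supervectors, speaker_ids):
    """Average the per-speaker covariance matrices of the supervectors."""
    X = np.asarray(supervectors, dtype=float)
    ids = np.asarray(speaker_ids)
    dim = X.shape[1]
    W = np.zeros((dim, dim))
    speakers = np.unique(ids)
    for s in speakers:
        Xs = X[ids == s]
        centered = Xs - Xs.mean(axis=0)          # remove the speaker mean
        W += centered.T @ centered / len(Xs)     # covariance of this speaker's sessions
    return W / len(speakers)

def wccn_projection(W, ridge=1e-3):
    """Return B such that B @ B.T equals inv(W + ridge * I).

    The ridge term is an assumed regularizer that keeps the matrix invertible.
    """
    W_reg = W + ridge * np.eye(W.shape[0])
    return np.linalg.cholesky(np.linalg.inv(W_reg))

def project(supervectors, B):
    """Map row-vector supervectors x to B.T @ x."""
    return np.asarray(supervectors, dtype=float) @ B

# Synthetic stand-in for supervectors: 5 speakers, 20 sessions each, 6 dimensions.
rng = np.random.default_rng(0)
n_speakers, n_sessions, dim = 5, 20, 6
speaker_means = rng.normal(size=(n_speakers, dim)) * 3.0
ids = np.repeat(np.arange(n_speakers), n_sessions)
X = speaker_means[ids] + rng.normal(size=(len(ids), dim))

W = within_class_covariance(X, ids)
B = wccn_projection(W)
X_proj = project(X, B)

# After projection the within-class covariance is close to the identity,
# so session variability no longer dominates any single direction.
W_proj = within_class_covariance(X_proj, ids)
```

In a full system the projected supervectors would then be passed to a linear SVM, with the target speaker's sessions as positive examples and background speakers as negatives. That stage is omitted here because the abstract gives no details about the kernel or training setup.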