Improving Speaker Recognition by Training on Emotion-Added Models

Tian Wu, Yingchun Yang, Zhaohui Wu
DOI: https://doi.org/10.1007/11573548_49
2005-01-01
Abstract: In speaker recognition applications, changes in emotional state are a main cause of errors. The ongoing work described in this contribution attempts to enhance the performance of automatic speaker recognition (ASR) systems on emotional speech. Two procedures that need only a small quantity of affective training data are applied to the ASR task, which makes the approach practical in real-world situations. The method consists of classifying emotional states by acoustic features and generating an emotion-added model based on the emotion grouping. Experiments are performed on the Emotional Prosody Speech (EPS) corpus and show significant improvement in EERs and IRs compared with the baseline and comparative experiments.