Emotional Speech Clustering Based Robust Speaker Recognition System

Dongdong Li, Yingchun Yang
DOI: https://doi.org/10.1109/cisp.2009.5304327
2009-01-01
Abstract: Emotional variation in speech degrades the performance of speaker recognition systems. Existing speaker modeling approaches disregard the mismatch in emotional state between training and testing speech, and the systems suffer from emotion recognition errors in practical applications. We propose an alternative approach that exploits prosodic differences to cluster affective speech and then builds a corresponding model from each cluster for a given speaker. The aim is to match each test utterance with one of the clustered speaker models and to make effective use of the limited affective speech available. The method is evaluated on the Mandarin Affective Speech Corpus. Experimental results show that the proposed approach achieves a relative improvement of at least 19% over the traditional speaker recognition task. We also show that this approach is more robust to emotional effects than other speaker recognition systems.
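The pipeline outlined in the abstract (cluster a speaker's affective speech by prosody, build one model per cluster, then match a test utterance against the clustered models) can be illustrated with a minimal sketch. The abstract does not specify the prosodic features, the clustering algorithm, or the speaker model, so everything here is an assumption made for illustration: mean F0 stands in for the prosodic cue, a simple one-dimensional k-means does the clustering, and a per-cluster mean feature vector stands in for the speaker model. The names `kmeans_1d`, `train_speaker`, `score`, and `identify` are hypothetical, and the data are toy values, not results from the paper.

```python
from statistics import mean


def kmeans_1d(values, k, iters=20):
    """Cluster scalar values (here: per-utterance mean F0) into k groups."""
    # Deterministic initialisation: spread the centroids over the sorted values.
    srt = sorted(values)
    centroids = [srt[(len(srt) - 1) * i // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels, centroids


def train_speaker(utterances, k=2):
    """utterances: list of (mean_f0, feature_vector) pairs for one speaker.

    Returns one centroid model per non-empty prosodic cluster.
    (Assumption: a mean vector stands in for the real speaker model.)
    """
    labels, _ = kmeans_1d([f0 for f0, _ in utterances], k)
    models = []
    for c in range(k):
        vecs = [vec for (_, vec), lab in zip(utterances, labels) if lab == c]
        if vecs:
            models.append([mean(col) for col in zip(*vecs)])
    return models


def score(model, vec):
    """Similarity as negative squared Euclidean distance (higher is better)."""
    return -sum((m - x) ** 2 for m, x in zip(model, vec))


def identify(speaker_models, vec):
    """Pick the speaker whose best-matching cluster model scores highest."""
    return max(
        speaker_models,
        key=lambda spk: max(score(m, vec) for m in speaker_models[spk]),
    )


# Toy data (invented): two low-F0 and two high-F0 utterances per speaker.
train = {
    "A": [(120, [1.0, 1.0]), (125, [1.2, 0.8]), (200, [3.0, 0.0]), (210, [3.2, 0.2])],
    "B": [(180, [2.0, 2.0]), (185, [2.2, 2.2]), (260, [4.0, 3.0]), (270, [4.2, 3.2])],
}
speaker_models = {spk: train_speaker(utts) for spk, utts in train.items()}

print(identify(speaker_models, [3.0, 0.3]))  # test utterance near A's high-F0 cluster
```

Scoring a test utterance against every cluster model of every speaker, rather than first predicting its emotion, means the sketch never depends on a separate emotion recognizer, which is the failure mode the abstract attributes to existing systems.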