Identifying Speakers Using Their Emotion Cues

Ismail Shahin
DOI: https://doi.org/10.48550/arXiv.1801.07054
2018-01-22
Sound
Abstract:This paper addresses the formulation of a new speaker identification approach which employs knowledge of emotional content of speaker information. Our proposed approach in this work is based on a two-stage recognizer that combines and integrates both emotion recognizer and speaker recognizer into one recognizer. The proposed approach employs both Hidden Markov Models (HMMs) and Suprasegmental Hidden Markov Models (SPHMMs) as classifiers. In the experiments, six emotions are considered including neutral, angry, sad, happy, disgust and fear. Our results show that average speaker identification performance based on the proposed two-stage recognizer is 79.92% with a significant improvement over a one-stage recognizer with an identification performance of 71.58%. The results obtained based on the proposed approach are close to those achieved in subjective evaluation by human listeners.
What problem does this paper attempt to address?