Speech Emotion Recognition with Emotion-Pair Based Framework Considering Emotion Distribution Information in Dimensional Emotion Space.

Xi Ma, Zhiyong Wu, Jia Jia, Mingxing Xu, Helen Meng, Lianhong Cai
DOI: https://doi.org/10.21437/interspeech.2017-619
2017-01-01
Abstract: In this work, an emotion-pair based framework is proposed for speech emotion recognition. It constructs a more discriminative feature subspace for every pair of different emotions (an emotion-pair) in order to generate more precise emotion bi-classification results. Furthermore, it is found that in the dimensional emotion space, some of the archetypal emotions lie closer to each other than others. Motivated by this, a Naive Bayes classifier based decision fusion strategy is proposed, which aims to capture this emotion distribution information when deciding the final emotion category. We evaluated the classification framework on the USC IEMOCAP database. Experimental results demonstrate that the proposed method outperforms the hierarchical binary decision tree approach on both weighted accuracy (WA) and unweighted accuracy (UA). Moreover, our framework has the advantages that it can be generated fully automatically, without empirical guidance, and is easier to parallelize.
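The two stages described in the abstract (one bi-classifier per emotion-pair, then Naive Bayes fusion of the pairwise decisions) can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid bi-classifier is a stand-in, no per-pair feature selection is performed, and the four emotion labels, the two-dimensional toy features, and all function names are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math


def centroid(rows):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_pair_classifiers(X, y):
    """Build one nearest-centroid bi-classifier per emotion-pair (stand-in model)."""
    by_label = defaultdict(list)
    for features, label in zip(X, y):
        by_label[label].append(features)
    centroids = {label: centroid(rows) for label, rows in by_label.items()}
    pairs = list(combinations(sorted(centroids), 2))
    return pairs, centroids


def pair_decisions(x, pairs, centroids):
    """Return the winning emotion of every emotion-pair for one utterance."""
    return tuple(
        a if sq_dist(x, centroids[a]) <= sq_dist(x, centroids[b]) else b
        for a, b in pairs
    )


def train_fusion(decisions, y):
    """Count class frequencies and, per class, how often each pair winner occurs."""
    class_counts = Counter(y)
    winner_counts = defaultdict(Counter)  # (emotion, pair index) -> winner counts
    for dec, label in zip(decisions, y):
        for k, winner in enumerate(dec):
            winner_counts[(label, k)][winner] += 1
    return class_counts, winner_counts


def fuse(dec, class_counts, winner_counts):
    """Naive Bayes fusion: argmax of log prior plus summed log likelihoods."""
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = math.log(n / total)
        for k, winner in enumerate(dec):
            # Laplace smoothing; each bi-classifier has two possible outputs.
            score += math.log((winner_counts[(label, k)][winner] + 1) / (n + 2))
        scores[label] = score
    return max(scores, key=scores.get)


# Toy, made-up features (NOT IEMOCAP data); labels are hypothetical.
X = [[0.9, 0.2], [0.8, 0.3], [0.9, 0.9], [0.8, 0.8],
     [0.5, 0.4], [0.4, 0.5], [0.1, 0.1], [0.2, 0.2]]
y = ["angry", "angry", "happy", "happy",
     "neutral", "neutral", "sad", "sad"]

pairs, centroids = train_pair_classifiers(X, y)
train_decisions = [pair_decisions(x, pairs, centroids) for x in X]
class_counts, winner_counts = train_fusion(train_decisions, y)


def recognize(x):
    """Run all pairwise bi-classifiers, then fuse their decisions."""
    return fuse(pair_decisions(x, pairs, centroids), class_counts, winner_counts)


print(recognize([0.85, 0.3]))  # angry
```

Because the fusion stage learns which pairwise outcomes tend to co-occur for each true emotion, it can exploit the fact that some emotions are confused more often than others, which is the role the abstract assigns to the emotion distribution information. The pairwise classifiers are independent of one another, so they could be trained and evaluated in parallel. In a realistic setup the fusion stage would be fitted on held-out pairwise decisions rather than on the same samples used to train the bi-classifiers.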