Ordinal Learning for Emotion Recognition in Customer Service Calls.

Wenjing Han, Tao Jiang, Yan Li, Bjoern Schuller, Huabin Ruan
DOI: https://doi.org/10.1109/icassp40776.2020.9053648
2020-01-01
Abstract: Approaches toward ordinal speech emotion recognition (SER) tasks are commonly based on categorical classification algorithms, where the rank-ordered emotions are arbitrarily treated as independent categories. To exploit the ordinal information between emotional ranks, we propose to model ordinal SER tasks under a COnsistent RAnk Logits (CORAL) based deep learning framework. Specifically, a multi-class ordinal SER task is transformed into a series of binary SER sub-tasks, each predicting whether an utterance's emotion is larger than a given rank. All the sub-tasks are jointly solved by a single network with a mislabelling cost defined as the sum of the individual cross-entropy losses of the sub-tasks. With VGGish as our basic network structure, a VGGish-CORAL network is implemented in this contribution by minimizing the above CORAL-based cost. Experimental results on a real-world call center dataset and the widely used IEMOCAP corpus demonstrate the effectiveness of VGGish-CORAL compared to the categorical VGGish.
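The transformation described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names are hypothetical, the VGGish feature extractor is replaced by a single scalar `shared_score`, and the use of one shared score with a separate bias per sub-task follows the general CORAL formulation rather than anything stated in the abstract.

```python
import math

def encode_rank(label, num_ranks):
    """Turn a rank label y in {0, ..., K-1} into K-1 binary targets.

    Target k is 1 if the emotion rank is larger than rank k, else 0.
    """
    return [1 if label > k else 0 for k in range(num_ranks - 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def coral_loss(shared_score, biases, label, eps=1e-12):
    """Sum of the binary cross-entropy losses over the K-1 sub-tasks.

    Assumption: every sub-task shares one score and adds its own bias.
    """
    targets = encode_rank(label, len(biases) + 1)
    loss = 0.0
    for bias, target in zip(biases, targets):
        p = sigmoid(shared_score + bias)
        # Clamp to avoid log(0) for extreme scores.
        p = min(max(p, eps), 1.0 - eps)
        loss -= target * math.log(p) + (1 - target) * math.log(1.0 - p)
    return loss

def predict_rank(shared_score, biases):
    """Predicted rank = number of sub-tasks answering 'larger than rank k'."""
    return sum(1 for bias in biases if sigmoid(shared_score + bias) > 0.5)

# Hypothetical example with K = 4 ranks, i.e. 3 binary sub-tasks.
biases = [1.0, 0.0, -1.0]
targets = encode_rank(2, 4)
rank = predict_rank(0.5, biases)
```

Because the label is spread over several binary targets, a prediction that is off by one rank violates fewer sub-tasks than one that is off by three, which is how the ordinal information enters the cost.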