Cross-Corpus Speech Emotion Recognition Based on Hybrid Neural Networks

Abdul Rehman, Zhen-Tao Liu, Dan-Yun Li, Bao-Han Wu
DOI: https://doi.org/10.23919/ccc50068.2020.9189368
2020-01-01
Abstract: Speech emotion recognition (SER) helps enrich next-generation AI with emotional intelligence by grasping the emotion conveyed in voice and words. At present, SER is used only within experimental boundaries. The main challenge facing SER research is the lack of robustness across cultures, languages, and even minor differences such as age gaps between speakers. To create a more adaptable SER system under adversarial circumstances, we propose a hybrid neural network architecture that creates a holistic model by embedding the Mel-frequency cepstral coefficients (MFCCs) as one-hot inputs, such that differences in coefficients in each emotional category are inflated according to their importance. We performed experiments on three different databases to test the cross-corpus effectiveness of the proposed model.