Context- and Knowledge-Aware Graph Convolutional Network for Multimodal Emotion Recognition

Yahui Fu, Shogo Okada, Longbiao Wang, Lili Guo, Yaodong Song, Jiaxing Liu, Jianwu Dang
DOI: https://doi.org/10.1109/mmul.2022.3173430
IF: 3.4911
2022-01-01
IEEE Multimedia
Abstract: This work proposes an approach for emotion recognition in conversation that leverages context modeling, knowledge enrichment, and multimodal (text and audio) learning based on a graph convolutional network (GCN). We first construct two distinct graphs for modeling the contextual interaction and the knowledge dynamics. We then introduce an affective lexicon into knowledge graph building to enrich the emotional polarity of each concept, that is, the related knowledge of each token in an utterance. Next, we achieve a balance between the context and the affect-enriched knowledge by incorporating them into the construction of a new adjacency matrix for the GCN architecture, and train them jointly with multiple modalities to effectively structure the semantics-sensitive and knowledge-sensitive contextual dependence of each conversation. Our model achieves relative error reductions of over 22.6% and 11% in weighted-F1 over state-of-the-art baselines on the IEMOCAP and MELD databases, respectively, demonstrating the superiority of our method in emotion recognition.
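The abstract does not specify how the context graph and the knowledge graph are combined in the adjacency matrix, so the following is only a minimal sketch of one plausible reading: a convex combination of two utterance-level adjacency matrices, followed by the standard symmetric GCN normalization and a single propagation step. The function names, the `balance` parameter, and the toy matrices are all hypothetical and are not taken from the paper.

```python
import math

def blend_adjacency(context_adj, knowledge_adj, balance=0.5):
    """Convex combination of two same-shaped adjacency matrices.

    `balance` is a hypothetical trade-off weight (1.0 = context only,
    0.0 = knowledge only); the paper's actual weighting is not given here.
    """
    if not 0.0 <= balance <= 1.0:
        raise ValueError("balance must lie in [0, 1]")
    return [
        [balance * c + (1.0 - balance) * k for c, k in zip(c_row, k_row)]
        for c_row, k_row in zip(context_adj, knowledge_adj)
    ]

def normalize_adjacency(adj):
    """Standard GCN normalization D^{-1/2} (A + I) D^{-1/2}.

    Assumes non-negative edge weights, so every degree is at least 1.
    """
    n = len(adj)
    with_loops = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in with_loops]
    return [
        [inv_sqrt_deg[i] * with_loops[i][j] * inv_sqrt_deg[j] for j in range(n)]
        for i in range(n)
    ]

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj_norm, features, weights):
    """One propagation step: ReLU(A_norm @ X @ W)."""
    hidden = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]

# Toy conversation with three utterances (all values are made up).
context_adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]    # neighbouring turns
knowledge_adj = [[0.0, 0.2, 0.8], [0.2, 0.0, 0.4], [0.8, 0.4, 0.0]]  # concept relatedness
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]                      # fused text/audio features
weights = [[0.5, -0.5], [0.25, 0.75]]                                # layer weights

adj = blend_adjacency(context_adj, knowledge_adj, balance=0.5)
adj_norm = normalize_adjacency(adj)
output = gcn_layer(adj_norm, features, weights)
```

In this sketch, `balance` makes the trade-off between contextual and knowledge-based edges explicit; a learned or attention-based weighting would be an equally plausible reading of the abstract.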