DGSNet: Dual Graph Structure Network for Emotion Recognition in Multimodal Conversations

Shimin Tang, Changjian Wang, Fengyu Tian, Kele Xu, Minpeng Xu
DOI: https://doi.org/10.1109/ictai59109.2023.00019
2023-01-01
Abstract: Emotion is an intrinsic property of human beings, and emotion recognition in conversations (ERC) contributes to developing human-like machines. Fusing information from different modalities effectively is the key to multimodal ERC. Current methods employ graph convolutional networks to fuse multimodal information. They perform well in intra-modal fusion but poorly in cross-modal fusion, resulting in weak integration of multimodal information. To address this problem, we propose a dual graph structure network for emotion recognition in multimodal conversations (DGSNet). Specifically, we design a multimodal fusion mechanism based on the dual graph structure network: the heterogeneous features of each modality are extracted through the separated graph, and the complementary features of each modality are extracted through the aggregation graph. We then design a local attention mechanism for emotional dependency that constrains the scope and target of the emotion, which enhances the analysis of emotional dependency. We evaluate the proposed method on two benchmarks, IEMOCAP and MELD; the experimental results show that DGSNet fuses multimodal information effectively and improves emotion recognition performance.