A Persona-Infused Cross-Task Graph Network for Multimodal Emotion Recognition with Emotion Shift Detection in Conversations

Geng Tu, Feng Xiong, Bin Liang, Ruifeng Xu
DOI: https://doi.org/10.1145/3626772.3657944
2024-01-01
Abstract: Recent research in Multimodal Emotion Recognition in Conversations (MERC) focuses on multimodal fusion and on modeling speaker-sensitive context. In addition to contextual information, personality traits also affect emotional perception. However, current MERC methods consider only the personality influence of speakers, neglecting speaker-addressee interaction patterns. Additionally, the bottleneck problem of Emotion Shift (ES), where consecutive utterances by the same speaker exhibit different emotions, has long been neglected in MERC. Early ES research fails to distinguish diverse shift patterns and simply feeds whether a shift occurs into the MERC model as knowledge, without considering the complementary nature of the two tasks. To address these gaps, we propose a Persona-infused Cross-task Graph Network (PCGNet). It first models speaker-addressee interactive relationships with a persona-infused refinement network. Then, it learns the auxiliary task of ES Detection and the main task of MERC using cross-task connections to capture correlations across the two tasks. Finally, we introduce shift-aware contrastive learning to discern diverse shift patterns. Experimental results demonstrate that PCGNet outperforms state-of-the-art methods on two widely used datasets.
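To make the Emotion Shift definition in the abstract concrete, the following is a minimal sketch of how binary ES labels could be derived from per-utterance emotion labels. It is not the authors' implementation: the function name `emotion_shift_labels`, the label encoding (1 for shift, 0 for no shift, `None` for a speaker's first utterance), and the reading of "consecutive utterances by the same speaker" as "the speaker's current and immediately preceding utterance in the dialogue" are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def emotion_shift_labels(
    speakers: List[str], emotions: List[str]
) -> List[Optional[int]]:
    """Hypothetical helper: label each utterance with an emotion-shift flag.

    1    -> emotion differs from the same speaker's previous utterance
    0    -> emotion matches the same speaker's previous utterance
    None -> the speaker has no earlier utterance in this dialogue
    """
    last_emotion: Dict[str, str] = {}  # most recent emotion seen per speaker
    labels: List[Optional[int]] = []
    for speaker, emotion in zip(speakers, emotions):
        if speaker not in last_emotion:
            labels.append(None)  # nothing to compare against yet
        else:
            labels.append(int(emotion != last_emotion[speaker]))
        last_emotion[speaker] = emotion
    return labels


# Toy dialogue (invented for illustration, not taken from any dataset).
speakers = ["A", "B", "A", "B", "A"]
emotions = ["neutral", "neutral", "anger", "neutral", "anger"]
print(emotion_shift_labels(speakers, emotions))  # [None, None, 1, 0, 0]
```

A flag of this kind only records whether a shift occurs. The abstract's point is that distinct shift patterns (for example, which emotion changes into which) should also be distinguished, which this sketch does not attempt.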