From Extraction to Generation: Multimodal Emotion-Cause Pair Generation in Conversations

Heqing Ma, Jianfei Yu, Fanfan Wang, Hanyu Cao, Rui Xia
DOI: https://doi.org/10.1109/taffc.2024.3446646
IF: 13.99
2024-01-01
IEEE Transactions on Affective Computing
Abstract: As an important task in emotion analysis, Multimodal Emotion-Cause Pair Extraction in conversations (MECPE) aims to extract all the emotion-cause utterance pairs from a conversation. However, the MECPE task has two shortcomings: 1) it ignores emotion utterances whose causes cannot be located in the conversation and instead require contextualized inference; 2) it fails to locate the exact causes that occur in the vision or audio modalities rather than in the text. To address these issues, in this paper, we introduce a new task named Multimodal Emotion-Cause Pair Generation in Conversations (MECPG), which aims to identify the emotion utterances with their emotion categories and generate their corresponding causes in a conversation. To tackle the MECPG task, we construct a dataset based on a benchmark corpus for MECPE. We further propose a generative framework named MONICA, which jointly performs emotion recognition and emotion cause generation with a sequence-to-sequence model. Experiments on our annotated dataset show the superiority of MONICA over several competitive systems. Our dataset and source code will be publicly released.
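The difference between the two task outputs described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `ExtractedPair`, `GeneratedPair`, and `uncovered_emotions`, the field layout, and the example indices are hypothetical and are not taken from the paper, its dataset, or the MONICA code. The sketch only shows that an extraction-style output pairs utterance indices, so an emotion utterance with no cause utterance in the conversation gets no pair, while a generation-style output attaches an emotion category and a free-text cause to each emotion utterance.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical MECPE-style output: (emotion utterance index, cause utterance index).
ExtractedPair = Tuple[int, int]


@dataclass
class GeneratedPair:
    """Hypothetical MECPG-style output for one emotion utterance."""
    emotion_index: int  # index of the emotion utterance in the conversation
    emotion: str        # emotion category of that utterance
    cause: str          # generated free-text cause; need not quote any utterance


def uncovered_emotions(emotion_indices: List[int],
                       extracted_pairs: List[ExtractedPair]) -> List[int]:
    """Return emotion utterances that have no extracted cause utterance.

    These are the cases an extraction-only formulation leaves without a cause.
    """
    covered = {emotion_idx for emotion_idx, _ in extracted_pairs}
    return [idx for idx in emotion_indices if idx not in covered]


# Illustrative (made-up) conversation: utterances 1 and 2 carry emotions,
# but only utterance 1 has a cause that is itself an utterance (utterance 0).
emotion_indices = [1, 2]
extracted = [(1, 0)]
missing = uncovered_emotions(emotion_indices, extracted)

# A generation-style output can still describe a cause for utterance 2.
generated = [
    GeneratedPair(emotion_index=1, emotion="joy", cause="A friend shares good news."),
    GeneratedPair(emotion_index=2, emotion="surprise", cause="Someone enters the room unexpectedly."),
]
```

In this sketch `missing` contains the emotion utterances that an index-pair output cannot explain, while `generated` carries one entry per emotion utterance regardless of where the cause appears.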