Deepanway Ghosal, Navonil Majumder, Alexander Gelbukh, Rada Mihalcea, Soujanya Poria
Abstract: In this paper, we address the task of utterance level emotion recognition in conversations using commonsense knowledge. We propose COSMIC, a new framework that incorporates different elements of commonsense such as mental states, events, and causal relations, and build upon them to learn interactions between interlocutors participating in a conversation. Current state-of-the-art methods often encounter difficulties in context propagation, emotion shift detection, and differentiating between related emotion classes. By learning distinct commonsense representations, COSMIC addresses these challenges and achieves new state-of-the-art results for emotion recognition on four different benchmark conversational datasets. Our code is available at https://github.com/declare-lab/conv-emotion.
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve
The paper aims to address the problem of Emotion Recognition in Conversations (ERC). Specifically, the authors propose a new framework called COSMIC (COmmonSense knowledge for eMotion Identification in Conversations), which improves the task of emotion recognition in conversations by incorporating commonsense knowledge.
### Background and Challenges
1. **Complexity**: Natural conversations are influenced by multiple variables, including topics, viewpoints, speaker personalities, argument logic, intentions, etc. These variables make the emotional dynamics of conversations very complex.
2. **Context Propagation**: Existing state-of-the-art methods struggle to propagate context across turns, which makes emotion shifts within a conversation hard to detect.
3. **Emotion Category Distinction**: Closely related emotion categories, such as anger and irritation, are difficult to tell apart.
### Solution
1. **Commonsense Knowledge**: The COSMIC framework leverages different elements of commonsense knowledge, such as mental states, events, and causal relationships, to learn the interactions between participants in a conversation.
2. **Multi-stage Modeling**:
- **Context-Independent Feature Extraction**: Uses pre-trained Transformer language models (e.g., RoBERTa) to extract a feature vector for each utterance without looking at the surrounding conversation.
- **Commonsense Feature Extraction**: Extracts commonsense features from commonsense knowledge graphs.
- **Incorporating Commonsense Knowledge**: Uses the commonsense features to build richer context representations, which then feed the final emotion classification.
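
The three stages above can be illustrated with a toy pipeline. This is a minimal sketch of the data flow only, not the COSMIC architecture: the function names, the vector sizes, the `EMOTIONS` label set, and the simple running-summary update are all hypothetical stand-ins, and the "features" are pseudo-random vectors seeded by the text rather than outputs of RoBERTa or a commonsense knowledge graph. The classifier weights are random and untrained, so the predicted probabilities carry no meaning.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = List[float]
EMOTIONS = ["neutral", "joy", "anger"]  # illustrative label set, not from the paper


def utterance_features(text: str, dim: int = 8) -> Vector:
    """Stage 1 stand-in: one vector per utterance, computed without any context.
    A real system would take this from a pre-trained language model."""
    rng = random.Random(text)  # same text always yields the same vector
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def commonsense_features(text: str, speaker: str, dim: int = 4) -> Vector:
    """Stage 2 stand-in: a vector meant to represent commonsense signals
    (e.g. mental states or likely effects) for this speaker and utterance."""
    rng = random.Random(f"cs|{speaker}|{text}")
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def contextualize(inputs: Sequence[Vector], decay: float = 0.5) -> List[Vector]:
    """Stage 3 stand-in: blend each fused vector with a running summary of
    earlier turns, so later utterances carry conversational context."""
    if not inputs:
        return []
    state = [0.0] * len(inputs[0])
    outputs = []
    for x in inputs:
        state = [math.tanh(decay * s + (1.0 - decay) * xi) for s, xi in zip(state, x)]
        outputs.append(state)
    return outputs


def softmax(scores: Sequence[float]) -> Vector:
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_conversation(conversation: Sequence[Tuple[str, str]], seed: int = 0) -> List[Vector]:
    """Return one probability distribution over EMOTIONS per (speaker, text) turn."""
    fused = [utterance_features(t) + commonsense_features(t, s) for s, t in conversation]
    context = contextualize(fused)
    if not context:
        return []
    rng = random.Random(seed)  # untrained, randomly initialised linear classifier
    weights = [[rng.uniform(-1.0, 1.0) for _ in context[0]] for _ in EMOTIONS]
    return [softmax([sum(w * v for w, v in zip(row, vec)) for row in weights]) for vec in context]


if __name__ == "__main__":
    conversation = [("A", "I lost my keys again."), ("B", "Oh no, that is frustrating.")]
    for (speaker, text), probs in zip(conversation, predict_conversation(conversation)):
        print(speaker, text, [round(p, 3) for p in probs])
```

The point of the sketch is the ordering: utterance vectors and commonsense vectors are computed per turn, fused, and only then passed through a context-aware step before classification. Swapping any stand-in for a real component leaves that flow unchanged.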
### Experimental Results
1. **Datasets**: The paper conducts experiments on four benchmark conversational datasets: IEMOCAP, DailyDialog, MELD, and EmoryNLP.
2. **Performance Improvement**: COSMIC achieves new state-of-the-art emotion recognition results on all four datasets.
### Conclusion
By incorporating commonsense knowledge, the COSMIC framework models the emotional dynamics of conversations more effectively, addressing the shortcomings of existing methods in context propagation, emotion shift detection, and the distinction between related emotion categories.