Abstract: Sentiment analysis and emotion detection in conversation are becoming active research topics for a range of applications. With the development of social robots, social networks, and intelligent voice assistants, emotion detection is attracting more attention as a key component in these research fields. Many approaches have been proposed to handle this problem in recent years. However, these approaches focus on either the temporal change information of the conversation or the semantic correlation information of the dialogue, and do not combine the two. In this paper, we propose an incremental graph convolution network (I-GCN) for emotion detection in conversation. We first use a graph structure to represent the conversation at different times, which captures the semantic correlation information of utterances. Then, we apply an incremental graph structure to imitate the process of dynamic conversation, which preserves the temporal change information of the conversation. In particular, for the first step, we propose an utterance-level GCN (U-GCN) and a speaker-level GCN (S-GCN) to learn utterance features for emotion detection. U-GCN focuses on the correlations among utterances and applies a multi-head attention model to find latent correlation information among them, which aims to strengthen the guidance that semantic relevance gives to feature learning. S-GCN focuses on the correlation between speakers and utterances, which provides a different angle to guide the feature learning of utterances. When learning the model parameters, we continually use new utterances to fine-tune the parameters of the GNN, which enhances the contribution of temporal change information.
Detailed evaluations of the proposed method on three published conversation corpora demonstrate the effectiveness of our approach over several conventional competitive baselines.
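
The idea of combining an utterance-level graph, a speaker-level graph, and a graph that grows as the conversation proceeds can be illustrated with a small NumPy sketch. This is not the authors' implementation, and every name in it (attention_adjacency, speaker_adjacency, gcn_layer, encode_conversation) is hypothetical. It assumes a single attention head rather than multi-head attention, assumes the speaker graph simply links utterances spoken by the same speaker, uses random untrained weights, and omits the fine-tuning step described in the abstract.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_adjacency(feats):
    # Assumption: one scaled dot-product attention head stands in for
    # the multi-head attention mentioned in the abstract.
    scores = feats @ feats.T / np.sqrt(feats.shape[1])
    return softmax(scores, axis=1)  # each row sums to 1

def speaker_adjacency(speakers):
    # Assumption: utterances by the same speaker are connected
    # (self-loops included), then each row is normalised.
    s = np.asarray(speakers)
    adj = (s[:, None] == s[None, :]).astype(float)
    return adj / adj.sum(axis=1, keepdims=True)

def gcn_layer(adj, feats, weight):
    # One graph-convolution step: aggregate neighbours, project, ReLU.
    return np.maximum(adj @ feats @ weight, 0.0)

def encode_conversation(feats, speakers, w_utt, w_spk):
    # Concatenate utterance-level and speaker-level node features.
    h_utt = gcn_layer(attention_adjacency(feats), feats, w_utt)
    h_spk = gcn_layer(speaker_adjacency(speakers), feats, w_spk)
    return np.concatenate([h_utt, h_spk], axis=1)

rng = np.random.default_rng(0)
dim, hidden = 8, 4
w_utt = rng.normal(size=(dim, hidden))  # untrained, illustrative weights
w_spk = rng.normal(size=(dim, hidden))

feats = rng.normal(size=(3, dim))       # toy utterance embeddings
speakers = ["A", "B", "A"]

# Incremental view: re-encode the graph each time an utterance arrives.
snapshots = [
    encode_conversation(feats[:t], speakers[:t], w_utt, w_spk)
    for t in range(1, len(speakers) + 1)
]
```

Each entry of snapshots holds the node features of the conversation graph after one more utterance has been added, so the list mirrors the growing graph. In a trained model, a classifier over these features and a parameter update per new utterance would follow; both are left out here.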

Context- and Knowledge-Aware Graph Convolutional Network for Multimodal Emotion Recognition

CONSK-GCN: Conversational Semantic- and Knowledge-Oriented Graph Convolutional Network for Multimodal Emotion Recognition

MMGCN: Multimodal Fusion Via Deep Graph Convolution Network for Emotion Recognition in Conversation

GraphMFT: A Graph Network based Multimodal Fusion Technique for Emotion Recognition in Conversation

Context-Aware Affective Graph Reasoning For Emotion Recognition

Knowledge-Aware Graph Convolutional Network with Utterance-Specific Window Search for Emotion Recognition in Conversations

Synch-Graph: Multisensory Emotion Recognition Through Neural Synchrony Via Graph Convolutional Networks

MLGAT: multi-layer graph attention networks for multimodal emotion recognition in conversations

Multiple Knowledge-Enhanced Interactive Graph Network for Multimodal Conversational Emotion Recognition

Emotion Recognition in Conversation Based on a Dynamic Complementary Graph Convolutional Network

Affect-GCN: a multimodal graph convolutional network for multi-emotion with intensity recognition and sentiment analysis in dialogues

GraphCFC: A Directed Graph Based Cross-Modal Feature Complementation Approach for Multimodal Conversational Emotion Recognition

Fusion with Hierarchical Graphs for Multimodal Emotion Recognition

Multivariate, Multi-Frequency and Multimodal: Rethinking Graph Neural Networks for Emotion Recognition in Conversation

MFGCN: Multimodal fusion graph convolutional network for speech emotion recognition

Multimodal Knowledge-enhanced Interactive Network with Mixed Contrastive Learning for Emotion Recognition in Conversation

I-GCN: Incremental Graph Convolution Network for Conversation Emotion Detection

Conversational emotion recognition studies based on graph convolutional neural networks and a dependent syntactic analysis

Dense Graph Convolutional with Joint Cross-Attention Network for Multimodal Emotion Recognition