Recent Trends of Multimodal Affective Computing: A Survey from NLP Perspective

Guimin Hu, Yi Xin, Weimin Lyu, Haojian Huang, Chang Sun, Zhihong Zhu, Lin Gui, Ruichu Cai, Erik Cambria, Hasti Seifi
2024-10-30
Abstract: Multimodal affective computing (MAC) has garnered increasing attention due to its broad applications in analyzing human behaviors and intentions, especially in the text-dominated multimodal affective computing field. This survey presents the recent trends of multimodal affective computing from an NLP perspective through four hot tasks: multimodal sentiment analysis, multimodal emotion recognition in conversation, multimodal aspect-based sentiment analysis, and multimodal multi-label emotion recognition. The goal of this survey is to explore the current landscape of multimodal affective research, identify development trends, and highlight the similarities and differences across various tasks, offering a comprehensive report on the recent progress in multimodal affective computing from an NLP perspective. This survey covers the formalization of tasks, provides an overview of relevant works, describes benchmark datasets, and details the evaluation metrics for each task. It also briefly discusses research in multimodal affective computing involving facial expressions, acoustic signals, physiological signals, and emotion causes, as well as the technical approaches, challenges, and future directions of the field. To support further research, we released a repository that compiles related works in multimodal affective computing, providing detailed resources and references for the community.
Computation and Language
What problem does this paper attempt to address?
This paper reviews and analyzes the field of multimodal affective computing. Specifically, it examines the latest trends in the field from the perspective of natural language processing (NLP), focusing on four popular tasks: Multimodal Sentiment Analysis (MSA), Multimodal Emotion Recognition in Conversation (MERC), Multimodal Aspect-Based Sentiment Analysis (MABSA), and Multimodal Multi-Label Emotion Recognition (MMER). Its goal is to map the current landscape of multimodal affective research, identify development trends, and highlight the similarities and differences among these tasks, providing a comprehensive report on recent progress. For each task, the paper covers the formal task definition, an overview of related work, the benchmark datasets, and the evaluation metrics. It also briefly discusses multimodal affective computing research involving facial expressions, acoustic signals, physiological signals, and emotion causes, as well as technical approaches, challenges, and future directions. Finally, the authors release a repository that compiles related work on multimodal affective computing, providing detailed resources and references for the community. In this way, the paper aims to give beginners a comprehensive overview of multimodal affective computing and to give researchers an opportunity to reflect on past developments and explore future trends and technical approaches, especially in multimodal sentiment analysis and emotion recognition.