Multimodal Transformer Fusion for Emotion Recognition: A Survey

R. Séguier, Amdjed Belaref
DOI: https://doi.org/10.1109/ICNLP60986.2024.10692953
2024-03-22
Abstract: Transformer-based models have recently gained popularity because they model sequential data effectively, handle long-term dependencies, and manage large amounts of data [1]. These models are at the forefront of advances in many fields, notably emotion recognition, which is central to affective computing [2]. By fusing multiple modalities, Transformers provide a powerful tool for the nuanced understanding of human emotions. This survey explores the growing research field of multimodal emotion recognition with Transformer models and proposes a classification scheme for it. It presents an overview of recent advances in Transformer-based architectures and their fusion techniques for analyzing and interpreting emotions from various modalities. The survey also covers the challenges faced in this domain and how Transformer models address them.
Computer Science