Fine-grained Dual-space Context Analysis Network for Multimodal Emotion Recognition

Jinghao Li, Jiajia Tang, Wanzeng Kong, Yanyan Ying
DOI: https://doi.org/10.1109/ijcnn60899.2024.10651517
2024-01-01
Abstract: Multimodal emotion recognition has received widespread attention in a variety of domains because it can draw on emotional information from multiple modalities to improve recognition performance. However, existing coarse-grained works mainly attend to the temporal-domain context and ignore the joint temporal-spatial domain context, which significantly degrades emotion analysis performance. In this work, the fine-grained dual-space context analysis network (FDCAN) is proposed to investigate the fine-grained emotion context within the joint temporal-spatial emotion representation space. Specifically, an operation based on depthwise separable convolution is used to extract the temporal and spatial spaces from the EEG and EOG modalities. Furthermore, a correlation-analysis-based technique is introduced to capture cross-modality correlation information from these spaces, yielding a coupled temporal-spatial representation space. Additionally, an attention-based procedure is applied to model the comprehensive and sophisticated multimodal emotion context in the coupled representation space. This hierarchical dual-space emotion context procedure provides a new direction and has strong potential to facilitate emotion analysis. We validated the effectiveness of the proposed framework on DEAP, a popular public benchmark for multimodal emotion analysis. The experimental results show that our model achieved improved performance, reaching 97.4% and 96.9% on the binary and four-class emotion classification tasks, respectively.
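As a rough illustration of the depthwise separable convolution mentioned in the abstract, the sketch below applies one temporal filter per channel (the depthwise step) and then mixes channels at each time step (the pointwise step). The abstract does not give FDCAN's layer shapes, kernel sizes, or weights, so the function name, the toy signal, and all numbers here are assumptions chosen only to show the two-step structure. This is not the authors' implementation.

```python
def depthwise_separable_conv1d(x, depthwise_kernels, pointwise_weights):
    """Toy depthwise separable 1D convolution (hypothetical helper).

    x: list of channels, each a list of time samples (e.g. electrodes x time).
    depthwise_kernels: one temporal kernel per input channel, all the same length.
    pointwise_weights: one row per output channel, weighting the input channels.
    """
    if len(depthwise_kernels) != len(x):
        raise ValueError("need exactly one depthwise kernel per input channel")

    # Depthwise step: filter each channel along time with its own kernel
    # (no padding, no kernel flip), so channels are not mixed here.
    temporal = []
    for signal, kernel in zip(x, depthwise_kernels):
        k = len(kernel)
        temporal.append([
            sum(signal[t + i] * kernel[i] for i in range(k))
            for t in range(len(signal) - k + 1)
        ])

    # Pointwise (1x1) step: combine the filtered channels at each time step.
    n_steps = len(temporal[0])
    return [
        [sum(w * temporal[c][t] for c, w in enumerate(row)) for t in range(n_steps)]
        for row in pointwise_weights
    ]


# Assumed toy input: 2 channels x 4 time samples.
x = [[1.0, 2.0, 3.0, 4.0],
     [0.0, 1.0, 0.0, 1.0]]
depthwise_kernels = [[1.0, 1.0], [1.0, -1.0]]   # one temporal filter per channel
pointwise_weights = [[1.0, 0.0], [0.5, 0.5]]    # two output channels
out = depthwise_separable_conv1d(x, depthwise_kernels, pointwise_weights)
print(out)
```

Splitting the operation this way keeps the temporal filtering and the cross-channel mixing in separate steps, which is one plausible reading of how the temporal and spatial spaces could be treated separately before being coupled.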