Emotion Recognition with Pre-Trained Transformers Using Multimodal Signals

Juan Vazquez-Rodriguez, Grégoire Lefebvre, Julien Cumin, James L Crowley
DOI: https://doi.org/10.48550/arXiv.2212.13885
2022-12-22
Abstract: In this paper, we address the problem of multimodal emotion recognition from multiple physiological signals. We demonstrate that a Transformer-based approach is suitable for this task. In addition, we present how such models may be pretrained in a multimodal scenario to improve emotion recognition performances. We evaluate the benefits of using multimodal inputs and pre-training with our approach on a state-of-the-art dataset.
Signal Processing, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses multimodal emotion recognition, specifically recognizing emotions from multiple physiological signals. The authors focus on how to use the Transformer model to process physiological signals from wearable devices, such as electrocardiogram (ECG) and electroencephalogram (EEG), in order to improve the accuracy of emotion recognition.

The main problems discussed in the paper are:

1. **Insufficient data annotation**: Emotion recognition tasks lack enough annotated data to train deep-learning models effectively. The authors explore unsupervised pre-training techniques to overcome this limitation.
2. **Multimodal signal fusion**: The paper asks how physiological signals of different modalities can be fused effectively to improve emotion recognition performance. The authors propose a Transformer-based late-fusion method, which pre-trains and fine-tunes single-modal models separately and then combines the outputs of these models for the final emotion prediction.
3. **Applicability of the Transformer model**: The paper verifies whether the Transformer model is suitable for processing physiological signals and whether pre-training improves its performance on emotion recognition tasks.

The main contributions of the paper are:

- Proposing a technique for recognizing emotions from multimodal physiological signals using the Transformer model.
- Describing a method for pre-training the Transformer model for recognizing emotions from multimodal physiological signals.
- Showing experimentally that the multimodal pre-training strategy can effectively improve emotion recognition performance.

Through this work, the authors aim to provide new ideas and technical support for research in emotion recognition, especially recognition based on physiological signals.
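
The late-fusion step described above, in which each single-modal model produces its own output and the outputs are then combined, can be illustrated with a minimal sketch. This summary does not state how the paper combines the outputs, so the averaging of per-modality class probabilities used here is an assumption. The function names (`softmax`, `late_fusion`, `predict_class`) and the example logits are hypothetical, and the Transformer models themselves are replaced by fixed logit vectors.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(logits_by_modality: Dict[str, List[float]]) -> List[float]:
    """Average class probabilities across modalities.

    Assumption: plain averaging stands in for whatever fusion operator
    the paper actually uses; each value is the output of one
    independently trained single-modal model.
    """
    if not logits_by_modality:
        raise ValueError("at least one modality is required")
    probs = [softmax(logits) for logits in logits_by_modality.values()]
    n_classes = len(probs[0])
    if any(len(p) != n_classes for p in probs):
        raise ValueError("all modalities must score the same classes")
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def predict_class(fused_probs: List[float]) -> int:
    """Return the index of the most probable class."""
    return max(range(len(fused_probs)), key=lambda c: fused_probs[c])


# Hypothetical outputs of two single-modal models for a two-class task.
example_logits = {"ecg": [2.0, 0.0], "eeg": [0.0, 1.0]}
fused = late_fusion(example_logits)
predicted = predict_class(fused)
```

In this example the ECG model favors class 0 more strongly than the EEG model favors class 1, so the averaged prediction is class 0. Late fusion of this kind keeps the single-modal models independent, which matches the summary's description of pre-training and fine-tuning each model separately before combining them.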