Emotion recognition using heterogeneous convolutional neural networks combined with multimodal factorized bilinear pooling

Yong Zhang, Cheng Cheng, Shuai Wang, Tianqi Xia
DOI: https://doi.org/10.1016/j.bspc.2022.103877
IF: 5.1
2022-08-01
Biomedical Signal Processing and Control
Abstract: Multimodal emotion recognition is a challenging topic in the field of knowledge-based systems, and many methods have been studied. Nevertheless, it requires effective fused representations across modalities, and existing methods still struggle with this task. This paper therefore proposes a new deep learning model for emotion recognition based on heterogeneous convolutional neural networks (HCNNs) and multimodal factorized bilinear pooling (MFB). First, the model selects a subset of electroencephalogram (EEG) channels to reduce interference from redundant channels. Second, the HCNNs extract convolutional features from each modality, and MFB fuses the deep convolutional features of the different modalities. Finally, an ensemble strategy is used to verify the proposed model and to explore the influence of different bands on the results. The proposed method lets all elements of each modality's features interact with one another, capturing the complex internal relationships between modalities. Experimental results show that the best average accuracy of the proposed method is 91.84% on the DEAP dataset and 90.17% on the MAHNOB-HCI dataset, which demonstrates that the method improves multimodal emotion recognition performance and significantly outperforms state-of-the-art methods.
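The fusion step named in the abstract can be illustrated with a minimal sketch of factorized bilinear pooling in its commonly described form: project both feature vectors into a shared space, multiply element-wise, sum-pool over windows of size k, then apply power and L2 normalization. This is not the authors' implementation. The function name `mfb_fuse`, the dimensions, and the random matrices standing in for learned projections are all assumptions made for illustration.

```python
import numpy as np

def mfb_fuse(x, y, U, V, k):
    """Fuse two modality feature vectors with factorized bilinear pooling.

    x: (m,) features of one modality (e.g. EEG), y: (n,) features of another.
    U: (m, k*o) and V: (n, k*o) are projection matrices (learned in practice).
    Returns an o-dimensional fused vector.
    """
    joint = (x @ U) * (y @ V)              # element-wise product, shape (k*o,)
    o = joint.shape[0] // k
    z = joint.reshape(o, k).sum(axis=1)    # sum pooling over windows of size k
    z = np.sign(z) * np.sqrt(np.abs(z))    # power normalization
    norm = np.linalg.norm(z)
    return z / norm if norm > 0 else z     # L2 normalization

# Hypothetical sizes; random weights stand in for trained parameters.
rng = np.random.default_rng(0)
m, n, k, o = 128, 64, 5, 16
x = rng.standard_normal(m)
y = rng.standard_normal(n)
U = rng.standard_normal((m, k * o))
V = rng.standard_normal((n, k * o))
fused = mfb_fuse(x, y, U, V, k)
print(fused.shape)
```

Because every output element mixes all entries of both inputs through the two projections, each element of one modality can interact with each element of the other, which is the property the abstract highlights.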
engineering, biomedical