MES-CTNet: A Novel Capsule Transformer Network Base on a Multi-Domain Feature Map for Electroencephalogram-Based Emotion Recognition

Yuxiao Du, Han Ding, Min Wu, Feng Chen, Ziman Cai
DOI: https://doi.org/10.3390/brainsci14040344
IF: 3.333
2024-03-31
Brain Sciences
Abstract: Emotion recognition using the electroencephalogram (EEG) has garnered significant attention within the realm of human–computer interaction due to the wealth of genuine emotional data stored in EEG signals. However, traditional emotion recognition methods fall short in mining the connections between multi-domain features and combining their advantages. In this paper, we propose a novel capsule Transformer network based on a multi-domain feature map for EEG-based emotion recognition, referred to as MES-CTNet. The model's core consists of a multichannel capsule neural network (CapsNet) embedded with ECA (Efficient Channel Attention) and SE (Squeeze and Excitation) blocks and a Transformer-based temporal coding layer. First, a multi-domain feature map is constructed by combining the space–frequency–time characteristics of the multi-domain features and is used as the input to the model. Then, local emotion features are extracted from the multi-domain feature maps by the improved CapsNet. Finally, the Transformer-based temporal coding layer globally perceives the emotion feature information across continuous time slices to obtain the final emotion state. Extensive experiments were conducted on two standard datasets with different emotion labels, DEAP and SEED. On the DEAP dataset, MES-CTNet achieved an average accuracy of 98.31% in the valence dimension and 98.28% in the arousal dimension; on the SEED dataset, it achieved 94.91% for the cross-session task, demonstrating superior performance compared to traditional EEG emotion recognition methods. By utilizing the proposed multi-domain feature map, MES-CTNet offers a broader observation perspective for EEG-based emotion recognition. It significantly improves classification accuracy and therefore holds considerable theoretical and practical value in the EEG emotion recognition domain.
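The last step described in the abstract, globally perceiving emotion features across continuous time slices, rests on self-attention. The following is a minimal NumPy sketch of single-head scaled dot-product self-attention, assuming each time slice has already been reduced to one feature vector (for example by the CapsNet stage). The number of slices, the feature width, the random weights, and the final mean-pooling are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def self_attention(slices, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    slices: (T, d) array, one feature vector per time slice.
    Returns the attended features (T, d_v) and the (T, T) attention weights.
    """
    q = slices @ w_q
    k = slices @ w_k
    v = slices @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])  # similarity of every slice pair
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ v, weights

# Hypothetical sizes: 6 time slices, 16 features per slice.
rng = np.random.default_rng(0)
T, d = 6, 16
slice_features = rng.standard_normal((T, d))
w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

context, attn = self_attention(slice_features, w_q, w_k, w_v)
# Assumed pooling: average over time to get one vector for classification.
pooled = context.mean(axis=0)
```

Every row of `attn` shows how strongly one time slice attends to all the others, which is what lets the temporal layer combine information from the whole sequence rather than from neighbouring slices only.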
neurosciences
What problem does this paper attempt to address?
The paper aims to address the problem of emotion recognition based on electroencephalogram (EEG) and proposes a new method to improve the accuracy of emotion recognition. Specifically, the researchers developed a novel capsule Transformer network named MES-CTNet to address the shortcomings of traditional emotion recognition methods in exploring the connections and advantages between multi-domain features. The core contributions of MES-CTNet are:

1. **Proposing a novel capsule Transformer network based on multi-domain feature maps**: This network combines the spatial perception capability of Capsule Networks (CapsNet) and the global semantic analysis capability of Transformers to fully utilize the multi-domain feature information in EEG signals. To enhance the ability of the capsule network to extract effective spatial features and suppress ineffective features, the researchers embedded Efficient Channel Attention (ECA) modules and Squeeze-and-Excitation (SE) modules in CapsNet.
2. **Constructing a new multi-domain feature map**: By integrating the spatial, frequency band, and temporal characteristics of EEG signals, a simple multi-domain feature map structure was constructed, enabling the model to more comprehensively focus on multi-level feature information, thereby improving recognition accuracy.
3. **Outstanding performance on standard datasets**: Extensive experiments were conducted on two widely recognized emotion recognition datasets, DEAP and SEED. The results show that MES-CTNet achieved an average accuracy of 98.31% (valence dimension) and 98.28% (arousal dimension) on the binary classification tasks of the DEAP dataset, and an average accuracy of 94.91% on the three-class classification task of the SEED dataset, demonstrating significant advantages over traditional EEG emotion recognition methods.
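The SE and ECA modules named above are both channel-attention mechanisms: each computes one gate per channel and rescales the feature map with it. The NumPy sketch below illustrates the general idea of the two blocks as they are commonly defined, not the paper's exact layers. The feature-map shape, the reduction ratio, the kernel size, and the random weights are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(x, w_reduce, w_expand):
    """Squeeze-and-Excitation over a (C, H, W) feature map.

    w_reduce: (C, C // r) and w_expand: (C // r, C) form the bottleneck.
    """
    squeezed = x.mean(axis=(1, 2))                 # squeeze: global average pool -> (C,)
    hidden = np.maximum(squeezed @ w_reduce, 0.0)  # excitation: reduce + ReLU
    gates = sigmoid(hidden @ w_expand)             # one gate in (0, 1) per channel
    return x * gates[:, None, None], gates

def eca_block(x, kernel):
    """Efficient Channel Attention over a (C, H, W) feature map.

    kernel: odd-length 1D weights slid across the channel axis, so each
    gate depends only on a few neighbouring channels (no bottleneck).
    """
    squeezed = x.mean(axis=(1, 2))                 # global average pool -> (C,)
    k = len(kernel)
    padded = np.pad(squeezed, k // 2)              # zero-pad to keep length C
    mixed = np.array([padded[i:i + k] @ kernel for i in range(len(squeezed))])
    gates = sigmoid(mixed)
    return x * gates[:, None, None], gates

# Hypothetical sizes: 8 channels on a 9x9 grid, reduction ratio 4, kernel size 3.
rng = np.random.default_rng(0)
C, H, W, r = 8, 9, 9, 4
x = rng.standard_normal((C, H, W))

se_out, se_gates = se_block(x,
                            rng.standard_normal((C, C // r)),
                            rng.standard_normal((C // r, C)))
eca_out, eca_gates = eca_block(x, rng.standard_normal(3))
```

The practical difference is parameter count: the SE bottleneck needs two dense matrices, while ECA shares a single small kernel across all channels.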
In summary, this paper addresses some limitations of existing EEG emotion recognition methods by introducing a novel network architecture and feature processing approach, providing valuable references and advancements for research in this field.