Abstract: This paper presents a novel approach to emotion recognition (ER) based on Electroencephalogram (EEG), Electromyogram (EMG), and Electrocardiogram (ECG) signals combined with computer vision. The proposed system comprises two separate models, one for physiological signals and one for facial expressions, deployed on a real-time embedded system. A custom dataset of EEG, ECG, EMG, and facial expression recordings was collected from 10 participants using an Affective Video Response System. Time-, frequency-, and wavelet-domain features were extracted and optimized based on their visualizations from Exploratory Data Analysis (EDA) and Principal Component Analysis (PCA). Local Binary Patterns (LBP), Local Ternary Patterns (LTP), Histogram of Oriented Gradients (HOG), and Gabor descriptors were used to differentiate facial emotions. Classification models, namely decision tree, random forest, and optimized variants thereof, were trained on these features. For the physiological signal-based model, the optimized Random Forest achieved an accuracy of 84%, while the optimized Decision Tree achieved 76%. The facial emotion recognition (FER) model attained accuracies of 84.6%, 74.3%, 67%, and 64.5% using K-Nearest Neighbors (KNN), Random Forest, Decision Tree, and XGBoost, respectively. Performance metrics, including the Receiver Operating Characteristic (ROC) curve, Area Under the Curve (AUC), and F1 score, were computed to evaluate the models. The outputs of the two models, i.e., the bio-signal analysis and the facial emotion analysis, are passed to a voting classifier to determine the final emotion. A comprehensive report is then generated from the resulting emotion using the Generative Pretrained Transformer (GPT) language model, achieving an accuracy of 87.5%. The model was implemented and deployed on a Jetson Nano, and the results show its relevance to ER. The system has applications in enhancing prosthetic systems and in other medical fields such as psychological therapy, rehabilitation, assistance for individuals with neurological disorders, mental health monitoring, and biometric security.
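
The decision-level fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state whether the voting classifier uses hard or soft voting, which emotion classes are used, or how the two models are weighted, so the soft-voting rule, the `EMOTIONS` label set, the function name `fuse_soft_vote`, and the equal default weights below are all assumptions made purely for illustration.

```python
# Hypothetical label set; the abstract does not list the emotion classes.
EMOTIONS = ["happy", "sad", "angry", "neutral"]


def fuse_soft_vote(p_physio, p_face, w_physio=0.5, w_face=0.5):
    """Fuse two class-probability vectors by weighted averaging (soft voting).

    p_physio: probabilities from the physiological-signal model (assumed output).
    p_face:   probabilities from the facial-expression model (assumed output).
    Returns the winning label and the normalized fused probabilities.
    """
    if len(p_physio) != len(EMOTIONS) or len(p_face) != len(EMOTIONS):
        raise ValueError("each input needs one probability per emotion class")

    # Weighted sum of the two models' class probabilities.
    fused = [w_physio * a + w_face * b for a, b in zip(p_physio, p_face)]

    total = sum(fused)
    if total <= 0:
        raise ValueError("fused scores must have a positive sum")

    # Renormalize so the fused scores form a probability distribution.
    fused = [v / total for v in fused]

    # Pick the class with the highest fused probability.
    best = max(range(len(fused)), key=fused.__getitem__)
    return EMOTIONS[best], fused


if __name__ == "__main__":
    label, fused = fuse_soft_vote([0.6, 0.2, 0.1, 0.1], [0.3, 0.4, 0.2, 0.1])
    print(label, fused)
```

In this sketch the two models disagree in the example call (the physiological model favors the first class, the facial model the second), and the averaged probabilities settle the decision. Unequal weights could be used if one modality proved more reliable, but the source gives no such weighting.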

Multimodal Emotion Recognition Using Different Fusion Techniques

Emotion Recognition in Videos via Fusing Multimodal Features.

Multimodal Emotion Recognition by Combining Physiological Signals and Facial Expressions: a Preliminary Study.

Multimodal fusion framework: A multiresolution approach for emotion classification and recognition from physiological signals

Multimodal emotion recognition model via hybrid model with improved feature level fusion on facial and EEG feature set

Multimodal Emotion Recognition Framework Using a Decision-Level Fusion and Feature-Level Fusion Approach

Multimodal Emotion Recognition Based on Feature Fusion.

Multimodal Emotion Recognition System Using Machine Learning Classifier

Multi-Modal Emotion Recognition Feature Extraction and Data Fusion Methods Evaluation

Multimodal modelling of human emotion using sound, image and text fusion

Multimodal emotion recognition using cross modal audio-video fusion with attention and deep metric learning

Multimodal Emotion Recognition based on the Fusion of EEG Signals and Eye Movement Data

Feature-level fusion of multimodal physiological signals for emotion recognition

Emotion recognition framework using multiple modalities for an effective human–computer interaction

Investigating Multisensory Integration in Emotion Recognition Through Bio-Inspired Computational Models

Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data

Multimodal emotion recognition based on a fusion of audiovisual information with temporal dynamics

Multimodal Emotion Detection via Attention-Based Fusion of Extracted Facial and Speech Features

MF-Net: a multimodal fusion network for emotion recognition based on multiple physiological signals

Multimodal emotion recognition based on the fusion of vision, EEG, ECG, and EMG signals

Multimodal Emotion Recognition Based on Cascaded Multichannel and Hierarchical Fusion