Abstract: Emotion recognition remains an intricate task at the crossroads of psychology and artificial intelligence, necessitating real-time, accurate discernment of implicit emotional states. Here, we introduce a pioneering wearable dual-modal device that combines functional near-infrared spectroscopy (fNIRS) and electroencephalography (EEG) to meet this demand. The first-of-its-kind fNIRS-EEG ensemble uses a temporal convolutional network (TC-ResNet) that takes 24 fNIRS and 16 EEG channels as input for the extraction and recognition of emotional features. The system's advantages include portability, battery efficiency, wireless capabilities, and a scalable architecture. It offers a real-time visual interface for observing cerebral electrical and hemodynamic changes, tailored for a variety of real-world scenarios. Our approach is a comprehensive emotion detection strategy, with new designs in system architecture and deployment and improvements in signal processing and interpretation. We examine the interplay of emotions and physiological responses to elucidate the cognitive processes of emotion regulation. An extensive evaluation of 30 subjects under four emotion induction protocols demonstrates our bimodal system's effectiveness in detecting emotions, with a classification accuracy of 99.81%, and its ability to reveal the interconnection between fNIRS and EEG signals. Compared with the latest unimodal identification methods, our bimodal approach shows significant accuracy gains of 0.24% over EEG and 8.37% over fNIRS. Moreover, our proposed TC-ResNet-driven temporal convolutional fusion technique outperforms conventional EEG-fNIRS fusion methods, with recognition accuracy improvements ranging from 0.7% to 32.98%. This research presents a groundbreaking advancement in affective computing that combines biological engineering and artificial intelligence.
Our integrated solution facilitates nuanced and responsive affective intelligence in practical applications, with far-reaching impacts on personalized healthcare, education, and human–computer interaction paradigms.

CFN-ESA: A Cross-Modal Fusion Network with Emotion-Shift Awareness for Dialogue Emotion Recognition

MFDR: Multiple-stage Fusion and Dynamically Refined Network for Multimodal Emotion Recognition

Enhancing Emotion Recognition in Conversation through Emotional Cross-Modal Fusion and Inter-class Contrastive Learning

MM-DFN: Multimodal Dynamic Fusion Network for Emotion Recognition in Conversations

Speaker-aware cognitive network with cross-modal attention for multimodal emotion recognition in conversation

A Cross-Modal Fusion Network Based on Self-Attention and Residual Structure for Multimodal Emotion Recognition

Temporal Convolutional Network-Enhanced Real-Time Implicit Emotion Recognition with an Innovative Wearable fNIRS-EEG Dual-Modal System

A Contextual Attention Network for Multimodal Emotion Recognition in Conversation

MultiEMO: an Attention-Based Correlation-Aware Multimodal Fusion Framework for Emotion Recognition in Conversations

Speaker-aware Cross-modal Fusion Architecture for Conversational Emotion Recognition

M2FNet: Multi-modal Fusion Network for Emotion Recognition in Conversation

MF-Net: a multimodal fusion network for emotion recognition based on multiple physiological signals

A Multi-Level Alignment and Cross-Modal Unified Semantic Graph Refinement Network for Conversational Emotion Recognition

Multimodal Prompt Transformer with Hybrid Contrastive Learning for Emotion Recognition in Conversation

Emotion recognition in conversations with emotion shift detection based on multi-task learning

Multimodal Knowledge-enhanced Interactive Network with Mixed Contrastive Learning for Emotion Recognition in Conversation

GCF2-Net: global-aware cross-modal feature fusion network for speech emotion recognition

MFGCN: Multimodal fusion graph convolutional network for speech emotion recognition

Multimodal Emotion Recognition Based on Cascaded Multichannel and Hierarchical Fusion

A novel feature fusion network for multimodal emotion recognition from EEG and eye movement signals

Dialogue emotion model based on local–global context encoder and commonsense knowledge fusion attention