Abstract: Driving can take up a substantial part of daily life and frequently triggers negative emotions such as anger or anxiety, which adversely affect both driving safety and long-term health. To identify driver emotions, and thereby improve the safety and humanization of intelligent driving, this work explores how to model discriminative emotion features from both speech and facial expressions. Specifically, we propose an attention-based network for facial expression recognition and a separate lightweight speech emotion network, and then combine the audio and video features at the feature level to construct a multimodal driver emotion recognition model. For audio, we propose a new feature extractor that uses a multi-scale residual structure to extract spectrogram features. For video, preprocessing with Local Binary Pattern Histograms (LBPH) yields a set of frame sequences with a fixed-dimensional feature representation. These features are fed into a fine-tuned ResNet18 model to analyze spatial information, and the model is augmented with a temporal attention module and a Gated Recurrent Unit (GRU) to produce a highly discriminative video representation. Additionally, we propose an Internet of Vehicles (IoV) platform designed specifically for driver emotion recognition. The platform consists of a sensor layer, a data acquisition and transport layer, a server layer, and a data application layer; it uses sensors to collect multimodal data from drivers, providing data support for the proposed algorithm. The algorithm is evaluated on two multimodal emotion datasets, the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and Surrey Audio-Visual Expressed Emotion (SAVEE), using a variety of performance indicators.
Compared with baseline methods, the proposed multimodal model achieves state-of-the-art results, with recognition accuracy of 0.93 on RAVDESS and 0.99 on SAVEE. It also attains precision of 0.93 and 0.99 and specificity of 0.99 and 1.00 on RAVDESS and SAVEE, respectively.
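The abstract reports accuracy, precision, and specificity but does not say how the per-class values are aggregated. The following is a minimal sketch of one common convention, assuming macro-averaging over classes from a confusion matrix; the function names and the toy 3-class matrix are hypothetical and are not taken from the paper.

```python
def per_class_counts(confusion):
    """Return (tp, fp, fn, tn) for each class of a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    counts = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                       # true k, predicted other
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true other
        tn = total - tp - fn - fp
        counts.append((tp, fp, fn, tn))
    return counts


def macro_metrics(confusion):
    """Accuracy plus macro-averaged precision and specificity (assumed convention)."""
    counts = per_class_counts(confusion)
    total = sum(sum(row) for row in confusion)
    n = len(counts)
    accuracy = sum(tp for tp, _, _, _ in counts) / total
    precision = sum(tp / (tp + fp) if (tp + fp) else 0.0
                    for tp, fp, _, _ in counts) / n
    specificity = sum(tn / (tn + fp) if (tn + fp) else 0.0
                      for _, fp, _, tn in counts) / n
    return {"accuracy": accuracy, "precision": precision, "specificity": specificity}


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
toy_confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]
metrics = macro_metrics(toy_confusion)
print(metrics)
```

Under this convention, specificity tends to exceed accuracy in multi-class problems, because each class's true-negative count includes every correctly handled sample from all other classes. This is consistent with the pattern of the reported scores, although the paper's exact averaging scheme is not stated here.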

A multimodal psychological, physiological and behavioural dataset for human emotions in driving tasks

A Spontaneous Driver Emotion Facial Expression (DEFE) Dataset for Intelligent Vehicles

A multimodal physiological dataset for driving behaviour analysis

Multimodal Dataset Construction and Validation for Driving-Related Anger: A Wearable Physiological Conduction and Vehicle Driving Data Approach

Discovering the Power of External Human-Machine Interface: an EEG Study on the Driving Anger Regulation

Multimodal Data Collection System for Driver Emotion Recognition Based on Self-Reporting in Real-World Driving

A multimodal dataset for various forms of distracted driving

Drivers' Comprehensive Emotion Recognition Based on HAM

OpenDriver: An Open-Road Driver State Detection Dataset

A Multimodal Dataset for Mixed Emotion Recognition

A dataset of daily ambulatory psychological and physiological recording for emotion research

Driver Emotion Recognition Involving Multimodal Signals: Electrophysiological Response, Nasal-Tip Temperature, and Vehicle Behavior

Analyze Spontaneous Gestures for Emotional Stress State Recognition: A Micro-gesture Dataset and Analysis with Deep Learning

A uniform human multimodal dataset for emotion perception and judgment

An Experimental Study on Physiological Parameters Toward Driver Emotion Recognition

A Multimodal Driver Emotion Recognition Algorithm Based on the Audio and Video Signals in Internet of Vehicles Platform

Feature Extraction and Dynamic Identification of Drivers’ Emotions

Performance Evaluation of Intelligent Driving Emotion Recognition Model based on Synthetic Dataset in Real Scenes

Feature Extraction and Dynamic Identification of Driving Intention Adapting to Multi-Mode Emotions

MPED: A Multi-Modal Physiological Emotion Database for Discrete Emotion Recognition

Investigating emotion fluctuations in driving behaviors of online car-hailing drivers using naturalistic driving data