Driver stress detection via multimodal fusion using attention-based CNN-LSTM

Luntian Mou, Chao Zhou, Pengfei Zhao, Bahareh Nakisa, Mohammad Naim Rastgoo, Ramesh Jain, Wen Gao
DOI: https://doi.org/10.1016/j.eswa.2021.114693
IF: 8.5
2021-07-01
Expert Systems with Applications
Abstract: Stress has been identified as one of the major contributing factors in car crashes due to its negative impact on driving performance. There is an urgent need to detect drivers' stress levels in real time and with high accuracy so that intervention or navigation measures can be taken in time to mitigate the situation. Existing driver stress detection models mainly rely on traditional machine learning techniques to fuse multimodal data. However, due to the non-linear correlations among modalities, it is still challenging for traditional multimodal fusion methods to handle the real-time influx of complex, high-dimensional multimodal data and to report drivers' stress levels accurately. To solve this issue, this paper proposes a framework for driver stress detection through multimodal fusion using attention-based deep learning techniques. Specifically, an attention-based convolutional neural network (CNN) and long short-term memory (LSTM) model is proposed to fuse non-invasive data, including eye data, vehicle data, and environmental data. The proposed model automatically extracts features separately from each modality and gives different levels of attention to features from different modalities through a self-attention mechanism. To verify the validity of the proposed method, extensive experiments have been carried out on the authors' dataset, collected using an advanced driving simulator. Experimental results demonstrate that the proposed method outperforms state-of-the-art models on driver stress detection, with an average accuracy of 95.5%.
computer science, artificial intelligence, engineering, electrical & electronic, operations research & management science
What problem does this paper attempt to address?
The paper addresses the real-time, accurate detection of driver stress levels during driving. To tackle this problem, the researchers propose a method that combines an attention-based Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) network to fuse multimodal data effectively. Specifically, the method uses non-intrusive data sources: eye movement data, vehicle dynamics data, and environmental data. The key contributions of the paper include:

1. **Proposing an attention mechanism-based CNN-LSTM network framework**: This framework automatically extracts features and assigns different weights to features from different modalities through a self-attention mechanism, thereby improving the accuracy of driver stress level classification.
2. **Proposing a non-intrusive multimodal data combination scheme**: This scheme combines eye movement data, vehicle dynamics data, and environmental data, which are easy to obtain and do not interfere with the driver, making it more suitable for practical deployment.
3. **Conducting extensive experimental validation**: The experimental results show that eye movement data has high potential for driver stress detection, and that the attention mechanism-based multimodal fusion model performs better than other models that do not use the attention mechanism, with an average accuracy of 95.5%.

In summary, the paper aims to improve existing driver stress detection technology through a novel deep learning method, enhancing its accuracy and practicality.
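
The idea of weighting features from different modalities through self-attention can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes each modality (eye, vehicle, environment) has already been reduced to a fixed-length feature vector, for example by a per-modality CNN-LSTM encoder, and it applies plain scaled dot-product self-attention with no learned projections. The function name `self_attention_fuse` and the input values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def self_attention_fuse(features):
    """Fuse per-modality feature vectors with scaled dot-product self-attention.

    features: dict mapping a modality name to a feature vector (list of floats).
    All vectors must have the same length (an assumption of this sketch).
    Returns (fused_vector, attention_weights), where attention_weights[a][b]
    is how strongly modality a attends to modality b.
    """
    names = list(features)
    vectors = [features[name] for name in names]
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all modality vectors must have the same length")

    scale = math.sqrt(dim)
    attended = []
    weights = {}
    for name, query in zip(names, vectors):
        # Each modality scores every modality (itself included).
        row = softmax([dot(query, key) / scale for key in vectors])
        weights[name] = dict(zip(names, row))
        # Attention output: weighted sum of all modality vectors.
        attended.append(
            [sum(w * v[i] for w, v in zip(row, vectors)) for i in range(dim)]
        )

    # Average the attended vectors into one fused representation.
    fused = [sum(column) / len(attended) for column in zip(*attended)]
    return fused, weights


if __name__ == "__main__":
    # Made-up feature vectors, for illustration only.
    fused, weights = self_attention_fuse({
        "eye": [0.9, 0.1, 0.4],
        "vehicle": [0.2, 0.8, 0.3],
        "environment": [0.1, 0.2, 0.7],
    })
    print(fused)
    print(weights["eye"])
```

In a trained model the queries, keys, and values would come from learned projections, and the fused vector would feed a classifier over stress levels; both are omitted here to keep the weighting step visible.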