Remote Photoplethysmography and Motion Tracking Convolutional Neural Network with Bidirectional Long Short-Term Memory: Non-Invasive Fatigue Detection Method Based on Multi-Modal Fusion

Lingjian Kong, Kai Xie, Kaixuan Niu, Jianbiao He, Wei Zhang
DOI: https://doi.org/10.3390/s24020455
IF: 3.9
2024-01-12
Sensors
Abstract: Existing vision-based fatigue detection methods commonly use RGB cameras to extract facial and physiological features for monitoring driver fatigue. These methods often rely on single indicators such as eyelid movement, yawning frequency, or heart rate. However, the accuracy of RGB cameras can be affected by factors such as varying lighting conditions and motion. To address these challenges, we propose a non-invasive method for multi-modal fusion fatigue detection called RPPMT-CNN-BiLSTM. This method incorporates a feature extraction enhancement module based on the improved Pan–Tompkins algorithm and 1D-MTCNN, which improves the accuracy of heart rate signal extraction and eyelid feature extraction. Furthermore, we use one-dimensional neural networks to construct two models based on heart rate and PERCLOS values, forming a fatigue detection model. To enhance the robustness and accuracy of fatigue detection, the outputs of the trained models are fed into the BiLSTM network. The network fits the temporal relationship among the data extracted by the CNN, allowing for effective dynamic modeling and achieving multi-modal fusion fatigue detection. Numerous experiments validate the effectiveness of the proposed method, which achieves an accuracy of 98.2% on the self-made MDAD (Multi-Modal Driver Alertness Dataset). This underscores the feasibility of the algorithm. In comparison with traditional methods, our approach demonstrates higher accuracy and contributes to maintaining traffic safety, thereby advancing the field of smart transportation.
engineering, electrical & electronic; chemistry, analytical; instruments & instrumentation
What problem does this paper attempt to address?
The paper addresses inaccurate detection of driver fatigue. Existing vision-based fatigue detection methods mainly extract facial and physiological features using RGB cameras, but their accuracy is limited by changing lighting conditions and motion interference. To address this, the paper presents a non-invasive multi-modal fusion fatigue detection method called RPPMT-CNN-BiLSTM. The method combines an improved Pan–Tompkins algorithm with 1D-MTCNN to extract heart rate signals and eyelid features more accurately. One-dimensional neural networks are used to build two models, one based on heart rate and one on PERCLOS values, and the outputs of these trained models are fed into a bidirectional long short-term memory (BiLSTM) network, which models their temporal relationship and performs the multi-modal fusion. Experiments show that the method achieves an accuracy of 98.2% on the self-built MDAD (Multi-Modal Driver Alertness Dataset), higher than the traditional methods it is compared with. The authors present it as a contribution to traffic safety and intelligent transportation.
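
As a rough illustration of the PERCLOS values mentioned above, the sketch below computes PERCLOS as the fraction of frames in which the eye counts as closed, both over a whole sequence and over sliding windows. It is not the paper's implementation: the per-frame openness input, the function names perclos and sliding_perclos, and the 0.2 threshold (the common "at least 80% closed" convention) are assumptions made for this example, since the abstract does not give the authors' exact definition.

```python
from typing import List, Sequence


def perclos(eye_openness: Sequence[float], closed_threshold: float = 0.2) -> float:
    """Return the fraction of frames in which the eye counts as closed.

    eye_openness: hypothetical per-frame openness in [0, 1], 1.0 = fully open.
    closed_threshold: assumed cut-off; openness at or below it counts as closed.
    """
    if not eye_openness:
        raise ValueError("eye_openness must contain at least one frame")
    closed_frames = sum(1 for value in eye_openness if value <= closed_threshold)
    return closed_frames / len(eye_openness)


def sliding_perclos(
    eye_openness: Sequence[float], window: int, closed_threshold: float = 0.2
) -> List[float]:
    """Return PERCLOS for every full window of `window` consecutive frames."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        perclos(eye_openness[start:start + window], closed_threshold)
        for start in range(len(eye_openness) - window + 1)
    ]


if __name__ == "__main__":
    frames = [1.0, 0.9, 0.1, 0.0, 0.8]  # made-up openness values
    print(perclos(frames))
    print(sliding_perclos(frames, window=2))
```

A windowed series of this kind is one plausible form for the one-dimensional input that a 1D network or a BiLSTM could consume, though the window length and frame rate used in the paper are not stated in the abstract.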