Multimodal driver distraction detection using dual-channel network of CNN and Transformer

Luntian Mou, Jiali Chang, Chao Zhou, Yiyuan Zhao, Nan Ma, Baocai Yin, Ramesh Jain, Wen Gao
DOI: https://doi.org/10.1016/j.eswa.2023.121066
IF: 8.5
2023-08-13
Expert Systems with Applications
Abstract: Distracted driving has become one of the main contributors to traffic accidents. It is therefore of great interest for intelligent vehicles to establish a distraction detection system that can continuously monitor driver behavior and respond accordingly. Although existing research has made significant progress, most methods focus on extracting either local features or global features while ignoring the other. To make full use of both local and global features, we integrate multi-source perception information and propose a novel dual-channel feature extraction model based on CNN and Transformer. To improve the model's ability to fit time series data, the CNN channel and the Transformer channel are modeled separately using the mid-point residual structure. The scaling factors in the residual structure are regarded as hyperparameters, and a penalized validation method based on bilevel optimization is introduced to obtain their optimal values automatically. Extensive experiments and comparisons with state-of-the-art methods on a multimodal dataset of driver distraction validate the effectiveness of the proposed method.
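The dual-channel idea in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' model. The abstract does not define the mid-point residual structure, so the sketch assumes a plain scaled residual, y = x + alpha * f(x). It also assumes scalar time steps, a fixed convolution kernel, single-head dot-product attention without learned projections, and fusion by pairing the two channel outputs. All function and parameter names (`conv1d_local`, `self_attention_global`, `scaled_residual`, `dual_channel`, `alpha_cnn`, `alpha_attn`) are hypothetical, and the bilevel search for the scaling factors is not shown.

```python
import math


def conv1d_local(x, kernel):
    """Local channel (assumed): 1-D convolution with zero padding, same output length."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k)) for i in range(len(x))]


def self_attention_global(x):
    """Global channel (assumed): dot-product self-attention over scalar time steps."""
    out = []
    for q in x:
        scores = [q * key for key in x]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, x)) / total)
    return out


def scaled_residual(x, branch, alpha):
    """Plain scaled residual y = x + alpha * branch(x); alpha is a hyperparameter."""
    fx = branch(x)
    return [xi + alpha * fi for xi, fi in zip(x, fx)]


def dual_channel(x, kernel, alpha_cnn, alpha_attn):
    """Run both channels on the same series and pair their per-step outputs."""
    local = scaled_residual(x, lambda s: conv1d_local(s, kernel), alpha_cnn)
    global_ = scaled_residual(x, self_attention_global, alpha_attn)
    return list(zip(local, global_))


if __name__ == "__main__":
    series = [1.0, 2.0, 3.0]
    for step in dual_channel(series, [1.0, 1.0, 1.0], alpha_cnn=0.5, alpha_attn=0.5):
        print(step)
```

In this sketch, setting a scaling factor to zero reduces that channel to the identity, which shows why its value affects how strongly each channel reshapes the input signal.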
Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic; Operations Research & Management Science