Driver behaviour recognition based on recursive all‐pair field transform time series model
HuiZhi Xu, ZhaoHao Xing, YongShuai Ge, DongSheng Hao, MengYing Chang
DOI: https://doi.org/10.1049/itr2.12528
IF: 2.7
2024-06-19
IET Intelligent Transport Systems
Abstract: The Driver‐vid video optical flow dataset and the Driver‐img image optical flow dataset are obtained through Recurrent All‐Pairs Field Transforms (RAFT) dense optical flow analysis, which accentuates the semantic aspects of driver actions. The Driver‐img dataset is used to train the feature extraction network MobileNet‐GYY, which then extracts image features from the first 64 frames of each video in the Driver‐vid dataset (for videos with fewer than 64 frames, all frames are used). Subsequently, a compressed padding method standardizes the data format of the video feature tensor, which is then input to the bidirectional GRU model. Finally, the fully connected layer outputs a tensor over ten driver behaviour categories, and the model is optimized using the cross‐entropy loss function. To standardize driver behaviour and enhance transportation system safety, a dynamic driver behaviour recognition method based on the RAFT temporal model is proposed. This study creates two datasets, Driver‐img and Driver‐vid, comprising driver behaviour images and videos across various scenarios. These datasets are preprocessed using RAFT optical flow techniques to support the network's learning process. The approach employs a two‐stage temporal model for driver behaviour recognition. In the first stage, the MobileNet network is optimized and the GYY module, which includes residuals and global average pooling layers, is introduced, enhancing the network's feature extraction capabilities. In the second stage, a bidirectional GRU network is constructed to learn driver behaviour video features with temporal information. Additionally, a method for compressing and padding video frames is proposed; its output serves as input to the GRU network and enables intent prediction 0.2 s before driver actions. Model performance is assessed through accuracy, recall, and F1 score; experimental results indicate that RAFT preprocessing enhances accuracy, reduces training time, and improves overall model stability, facilitating the recognition of driver behaviour intent.
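Two steps named in the abstract can be illustrated with a minimal, dependency-free Python sketch: fixing the length of a per-frame feature sequence at 64, and computing accuracy, recall, and F1 over ten classes. This is not the authors' implementation. The function names `standardize_frames` and `macro_scores` are hypothetical, zero-padding stands in for the paper's compressed padding method (whose details the abstract does not give), and macro averaging is an assumption, since the abstract does not state how per-class scores are combined.

```python
from typing import List, Sequence, Tuple

MAX_FRAMES = 64  # from the abstract: only the first 64 frames are used


def standardize_frames(
    features: Sequence[Sequence[float]], max_frames: int = MAX_FRAMES
) -> Tuple[List[List[float]], int]:
    """Keep the first `max_frames` feature vectors, then zero-pad to that length.

    Returns the fixed-length sequence and the count of real (unpadded) frames.
    Zero-padding is an assumption; the paper's compression step is not modelled.
    """
    if not features:
        raise ValueError("at least one frame is required")
    kept = [list(f) for f in features[:max_frames]]
    dim = len(kept[0])
    padding = [[0.0] * dim for _ in range(max_frames - len(kept))]
    return kept + padding, len(kept)


def macro_scores(
    y_true: Sequence[int], y_pred: Sequence[int], num_classes: int = 10
) -> Tuple[float, float, float]:
    """Return (accuracy, macro recall, macro F1) over `num_classes` classes.

    Classes that never occur in `y_true` or `y_pred` contribute 0 to the
    macro averages (a convention chosen here, not taken from the paper).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("labels must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    recalls, f1s = [], []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return accuracy, sum(recalls) / num_classes, sum(f1s) / num_classes
```

The real-frame count returned by `standardize_frames` is the kind of length information a recurrent model typically needs in order to ignore padded positions; whether the paper's method uses it in this way is not stated in the abstract.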
engineering, electrical & electronic; transportation science & technology