Abstract: In recent years, autonomous vehicles have attracted many researchers from academia and industry. Their efforts span object detection and tracking in computer vision, probabilistic fusion, and decision-making algorithms. However, most current methods that map image pixels directly to steering behavior fail to produce accurate results, especially when lighting and weather change drastically, largely because they ignore the temporal relationship between frames. In this paper, we propose a novel end-to-end deep learning framework based on temporal and spatial attention mechanisms, which aims to address both inaccurate steering angle prediction in complex environments and the difficulty of model interpretation. First, we use a video sequence and the historical steering angle sequence as model inputs, instead of a single frame. Second, we design a temporal attention mechanism to capture long- and short-term memory in the input visual information, and a spatial attention mechanism to capture key objects in the image and obtain their position information. This is achieved by inserting carefully designed SE-Net, ConvLSTM, and CNN layers at appropriate points in the network. Finally, we demonstrate the feasibility of the proposed model on the public Comma2k19 dataset, in comparison with current advanced methods. Experimental results show that, compared with state-of-the-art methods, the mean absolute error (MAE) of our model on the training and testing sets is reduced by 10.2% and 6.3%, respectively, indicating more accurate steering prediction. In addition, we explain the trigger mechanism of steering behavior prediction by visualizing the spatial attention map and temporal attention scores on the Comma2k19 and Udacity datasets, which further demonstrates that the proposed model can learn human-like driving behavior.
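The temporal attention scores and the MAE metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's architecture: the scaled dot-product scoring, the `query` vector, and the function names `temporal_attention_pool` and `mean_absolute_error` are assumptions chosen for illustration. The real model derives its scores from SE-Net, ConvLSTM, and CNN layers whose details the abstract does not give.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention_pool(frame_features, query):
    """Weight T per-frame feature vectors by their relevance to a query.

    Assumption: relevance is a scaled dot product between each frame's
    feature vector and the query. Returns (weights, context), where
    weights sum to 1 and context is the weighted sum of frame features.
    """
    d = len(query)
    scores = [
        sum(f_k * q_k for f_k, q_k in zip(f, query)) / math.sqrt(d)
        for f in frame_features
    ]
    weights = softmax(scores)
    context = [
        sum(w * f[k] for w, f in zip(weights, frame_features))
        for k in range(d)
    ]
    return weights, context


def mean_absolute_error(predicted, target):
    """MAE between predicted and ground-truth steering angles."""
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)


# Hypothetical toy input: three frames with two-dimensional features.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = temporal_attention_pool(frames, query=[1.0, 1.0])
```

In this toy input the third frame aligns best with the query, so it receives the largest weight. Per-frame weights of this kind are what a temporal attention visualization would plot.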

Jointly Predicting Future Sequence And Steering Angles For Dynamic Driving Scenes

An End-to-End Future Frame Prediction Method for Vehicle-Centric Driving Videos

A Novel Generation-Adversarial-Network-Based Vehicle Trajectory Prediction Method for Intelligent Vehicular Networks

Multimodal Vehicle Trajectory Prediction Based on Graph Convolutional Networks

Deep learning-based vehicle trajectory prediction based on generative adversarial network for autonomous driving applications

Multi-modal Vehicle Trajectory Prediction Via Attention-based Conditional Variational Autoencoder

A Novel End-to-End Model for Steering Behavior Prediction of Autonomous Ego-Vehicles Using Spatial and Temporal Attention Mechanism

Context-Aware Timewise VAEs for Real-Time Vehicle Trajectory Prediction

CAE‐GAN: A hybrid model for vehicle trajectory prediction

GenAD: Generative End-to-End Autonomous Driving

Dual Motion GAN for Future-Flow Embedded Video Prediction

Probabilistic Future Prediction for Video Scene Understanding

Edge Guided Generation Network for Video Prediction

Human Insights Driven Latent Space for Different Driving Perspectives: A Unified Encoder for Efficient Multi-Task Inference

Vehicle Trajectory Prediction Using Intention-based Conditional Variational Autoencoder

SA-SGAN - A Vehicle Trajectory Prediction Model Based on Generative Adversarial Networks

Visual-Angle Attention Predictor: A Multi-Agent Trajectory Predictor Based on Variational Auto-Encoder

Map-enhanced Generative Adversarial Trajectory Prediction Method for Automated Vehicles

Adaptive Visual Interaction Based Multi-Target Future State Prediction For Autonomous Driving Vehicles

Deep Steering: Learning End-to-End Driving Model from Spatial and Temporal Visual Cues