Abstract: In recent years, autonomous driving algorithms that use low-cost vehicle-mounted cameras have attracted growing effort from both academia and industry. This effort spans multiple fronts, including on-road object detection and 3-D reconstruction, but in this work we focus on a vision-based model that directly maps raw input images to steering angles using deep networks. This represents a nascent research topic in computer vision. The technical contributions of this work are three-fold. First, the model is learned and evaluated on real human driving videos that are time-synchronized with other vehicle sensors. This differs from many prior models trained on synthetic data from racing games. Second, state-of-the-art models, such as PilotNet, mostly predict the wheel angle independently for each video frame, which contradicts the common understanding of driving as a stateful process. Instead, our proposed model combines spatial and temporal cues, jointly exploiting instantaneous monocular camera observations and the vehicle's historical states. In practice, this is accomplished by inserting carefully designed recurrent units (e.g., LSTM and Conv-LSTM) at appropriate network layers. Third, to facilitate the interpretability of the learned model, we utilize a visual back-propagation scheme to discover and visualize the image regions that most strongly influence the final steering prediction. Our experimental study is based on about 6 hours of human driving data provided by Udacity. Comprehensive quantitative evaluations demonstrate the effectiveness and robustness of our model, even under scenarios such as drastic lighting changes and abrupt turning. The comparison with other state-of-the-art models shows its superior performance in predicting the appropriate wheel angle for a self-driving car.
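
To illustrate the contrast between frame-independent and stateful steering prediction, the following is a minimal sketch in plain Python, not the model described above. It assumes that per-frame image features are already available as short vectors, and it uses a generic LSTM cell with random, untrained weights. The names `TinyLSTMCell`, `predict_sequence`, and `readout`, as well as all sizes and input values, are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMCell:
    """Generic LSTM cell with random weights (hypothetical, untrained)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        n = input_size + hidden_size
        # One weight matrix and bias per gate:
        # "i" input, "f" forget, "o" output, "g" candidate cell update.
        self.W = {
            name: [[rng.uniform(-0.5, 0.5) for _ in range(n)]
                   for _ in range(hidden_size)]
            for name in "ifog"
        }
        self.b = {name: [0.0] * hidden_size for name in "ifog"}
        self.hidden_size = hidden_size

    def _affine(self, name, z):
        return [sum(w * v for w, v in zip(row, z)) + bias
                for row, bias in zip(self.W[name], self.b[name])]

    def step(self, x, h, c):
        z = list(x) + list(h)  # current input concatenated with previous hidden state
        i = [sigmoid(a) for a in self._affine("i", z)]
        f = [sigmoid(a) for a in self._affine("f", z)]
        o = [sigmoid(a) for a in self._affine("o", z)]
        g = [math.tanh(a) for a in self._affine("g", z)]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def predict_sequence(cell, readout, frame_features, vehicle_states):
    """Return one steering value per time step, carrying state across frames."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    angles = []
    for feat, state in zip(frame_features, vehicle_states):
        # Assumption: image features and vehicle state are simply concatenated.
        h, c = cell.step(list(feat) + list(state), h, c)
        angles.append(sum(w * v for w, v in zip(readout, h)))
    return angles


cell = TinyLSTMCell(input_size=3, hidden_size=4)
readout = [0.3, -0.2, 0.5, 0.1]  # hypothetical linear read-out weights
last_frame = [0.2, -0.1]         # identical final observation in both runs

straight = predict_sequence(
    cell, readout,
    [[0.0, 0.0], [0.0, 0.0], last_frame],
    [[0.0], [0.0], [0.0]],
)
turning = predict_sequence(
    cell, readout,
    [[0.9, 0.4], [0.8, 0.5], last_frame],
    [[0.3], [0.3], [0.0]],
)
print(straight[-1], turning[-1])
```

Both runs end on the same frame and the same current vehicle state, yet the two final predictions differ because the recurrent state carries the preceding history. A model that predicts each frame independently would return the same value in both cases.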

PilotAttnNet: Multi-modal Attention Network for End-to-End Steering Control

End-to-end Multi-Modal Multi-Task Vehicle Control for Self-Driving Cars with Visual Perception

Incorporating Orientations into End-to-end Driving Model for Steering Control

A LiDAR Based End to End Controller for Robot Navigation Using Deep Neural Network

Deep Steering: Learning End-to-End Driving Model from Spatial and Temporal Visual Cues

Learning End-to-End Autonomous Steering Model from Spatial and Temporal Visual Cues

ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning

Human Insights Driven Latent Space for Different Driving Perspectives: A Unified Encoder for Efficient Multi-Task Inference

Autonomous Driving with Human Guided Image Feature Extraction

Multimodal End-to-End Learning for Autonomous Steering in Adverse Road and Weather Conditions

Deep learning and control algorithms of direct perception for autonomous driving

E-DNet: An End-to-End Dual-Branch Network for Driver Steering Intention Detection

Enhancing Accuracy and Robustness of Steering Angle Prediction with Attention Mechanism

Brain Inspired Cognitive Model with Attention for Self-Driving Cars

Learning On-Road Visual Control for Self-Driving Vehicles with Auxiliary Tasks

FlowDriveNet: an End-to-End Network for Learning Driving Policies from Image Optical Flow and LiDAR Point Flow

Trajectory-guided Control Prediction for End-to-end Autonomous Driving: A Simple yet Strong Baseline

Aggregated Sparse Attention for Steering Angle Prediction

Perceive, Attend, and Drive: Learning Spatial Attention for Safe Self-Driving