Abstract: We propose UAD, a method for vision-based end-to-end autonomous driving (E2EAD) that achieves the best open-loop evaluation performance on nuScenes while showing robust closed-loop driving quality in CARLA. Our motivation stems from the observation that current E2EAD models still mimic the modular architecture of typical driving stacks, with carefully designed supervised perception and prediction subtasks that provide environment information for planning. Although this design has achieved groundbreaking progress, it has certain drawbacks: 1) the preceding subtasks require massive high-quality 3D annotations as supervision, which poses a significant impediment to scaling the training data; 2) each submodule entails substantial computation overhead in both training and inference. To this end, we propose UAD, an E2EAD framework with an unsupervised proxy that addresses these issues. First, we design a novel Angular Perception Pretext to eliminate the annotation requirement. The pretext models the driving scene by predicting angular-wise spatial objectness and temporal dynamics, without manual annotation. Second, we propose a self-supervised training strategy, which learns the consistency of the predicted trajectories under different augmented views, to enhance planning robustness in steering scenarios. UAD achieves a 38.7% relative improvement over UniAD on the average collision rate in nuScenes and surpasses VAD by 41.32 points on the driving score in CARLA's Town05 Long benchmark. Moreover, the proposed method consumes only 44.3% of the training resources of UniAD and runs 3.4 times faster in inference. Our design not only demonstrates, for the first time, clear performance advantages over supervised counterparts, but also offers unprecedented efficiency in data, training, and inference.
Code and models will be released at <a class="link-external link-https" href="https://github.com/KargoBot_Research/UAD" rel="external noopener nofollow">this https URL</a>.
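The abstract describes the self-supervised strategy only as learning the consistency of predicted trajectories under different augmented views. As a rough illustration of that idea, and not the authors' implementation, the sketch below assumes the augmentation is a bird's-eye-view rotation by a known angle and that consistency is measured as a mean L1 distance after rotating the augmented-view trajectory back. The function names `rotate_xy` and `consistency_loss`, the choice of rotation, and the L1 distance are all hypothetical.

```python
import numpy as np


def rotate_xy(points, angle_rad):
    """Rotate an (N, 2) array of planar waypoints counter-clockwise by angle_rad."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot = np.array([[c, -s], [s, c]])
    return points @ rot.T


def consistency_loss(traj_orig, traj_aug, angle_rad):
    """Mean L1 gap between the original-view trajectory and the
    augmented-view trajectory mapped back into the original frame.

    Assumption: the augmented view is the original view rotated by angle_rad.
    """
    traj_back = rotate_xy(traj_aug, -angle_rad)
    return float(np.mean(np.abs(traj_orig - traj_back)))


# Toy waypoints standing in for a planner's output (hypothetical values).
traj = np.array([[0.0, 1.0], [0.0, 2.0], [0.5, 3.0]])
angle = np.deg2rad(30.0)

# A perfectly consistent planner would output the rotated trajectory.
loss_consistent = consistency_loss(traj, rotate_xy(traj, angle), angle)

# An inconsistent planner drifts by 0.2 in both coordinates of every waypoint.
loss_drift = consistency_loss(traj, rotate_xy(traj, angle) + 0.2, angle)
```

In a training loop, a term of this kind would be added to the planning loss so that the planner is penalized when its outputs for the two views disagree; how the term is weighted and which augmentations are used are not stated in the abstract.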

VADv2: End-to-End Vectorized Autonomous Driving via Probabilistic Planning

VAD: Vectorized Scene Representation for Efficient Autonomous Driving

GenAD: Generative End-to-End Autonomous Driving

End-to-End Autonomous Driving without Costly Modularization and 3D Manual Annotation

Probabilistic End-to-End Vehicle Navigation in Complex Dynamic Environments with Multimodal Sensor Fusion

HE-Drive: Human-Like End-to-End Driving with Vision Language Models

Think Twice Before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving

Safe and Generalized end-to-end Autonomous Driving System with Reinforcement Learning and Demonstrations

PPAD: Iterative Interactions of Prediction and Planning for End-to-end Autonomous Driving

End-to-End Autonomous Driving With Semantic Depth Cloud Mapping and Multi-Agent

SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation

Efficient and Generalized End-to-end Autonomous Driving System with Latent Deep Reinforcement Learning and Demonstrations

Planning by Simulation: Motion Planning with Learning-based Parallel Scenario Prediction for Autonomous Driving

VLP: Vision Language Planning for Autonomous Driving

VWP: An Efficient DRL-Based Autonomous Driving Model

Integrating Decision-Making Into Differentiable Optimization Guided Learning for End-to-End Planning of Autonomous Vehicles

Parallel Planning: A New Motion Planning Framework for Autonomous Driving

BEVGPT: Generative Pre-trained Large Model for Autonomous Driving Prediction, Decision-Making, and Planning