Abstract: Human-level driving is an ultimate goal of autonomous driving. Conventional approaches formulate autonomous driving as a perception-prediction-planning framework, yet their systems do not capitalize on the inherent reasoning ability and experiential knowledge of humans. In this paper, we propose a fundamental paradigm shift from current pipelines, exploiting Large Language Models (LLMs) as a cognitive agent to integrate human-like intelligence into autonomous driving systems. Our approach, termed Agent-Driver, transforms the traditional autonomous driving pipeline by introducing a versatile tool library accessible via function calls, a cognitive memory of common sense and experiential knowledge for decision-making, and a reasoning engine capable of chain-of-thought reasoning, task planning, motion planning, and self-reflection. Powered by LLMs, our Agent-Driver is endowed with intuitive common sense and robust reasoning capabilities, thus enabling a more nuanced, human-like approach to autonomous driving. We evaluate our approach on the large-scale nuScenes benchmark, and extensive experiments substantiate that our Agent-Driver significantly outperforms the state-of-the-art driving methods by a large margin. Our approach also demonstrates superior interpretability and few-shot learning ability to these methods.

What problem does this paper attempt to address?

The paper addresses how to integrate human prior knowledge and reasoning abilities into autonomous driving systems so that they drive in a more human-like and efficient way. Traditional autonomous driving systems typically adopt a perception-prediction-planning framework, but they lack the reasoning ability and experiential knowledge of human drivers, which leads to poor performance in complex driving scenarios. For example, on seeing a ball on the road, a human driver would instinctively anticipate that a child might be chasing it and would slow down, whereas a traditional autonomous driving system might keep driving until its sensors detect the child, leaving a smaller safety margin.

To address this, the paper proposes Agent-Driver, which uses large language models (LLMs) as a cognitive agent to bring human prior knowledge and reasoning abilities into autonomous driving systems. Its main contributions are:

1. **Introduction of a tool library**: Extracting the necessary environmental information from neural modules through dynamic function calls, which reduces redundancy.
2. **Construction of cognitive memory**: Storing common sense and driving experience to strengthen the system's decision-making.
3. **Design of a reasoning engine**: Combining environmental information and memory data to perform chain-of-thought reasoning, task planning, motion planning, and self-reflection, producing safe and comfortable driving trajectories.

Experimental results on the large-scale nuScenes benchmark show that Agent-Driver significantly outperforms existing autonomous driving methods, with the collision rate reduced by over 30%. The approach also demonstrates strong few-shot learning ability and interpretability.
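To make the interplay of the three components concrete, here is a minimal Python sketch of one decision step. It is an illustration under stated assumptions, not the paper's implementation: the class and function names (`ToolLibrary`, `CognitiveMemory`, `drive_step`), the word-overlap retrieval, the speed values, and the `SPEED_LIMIT` constant are all hypothetical, and a simple lookup stands in for the LLM call that the real system would make.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

SPEED_LIMIT = 9.0  # hypothetical cap (m/s) used by the self-reflection step


@dataclass
class ToolLibrary:
    """Named functions that expose outputs of perception/prediction modules."""
    tools: Dict[str, Callable[[], str]] = field(default_factory=dict)

    def register(self, name: str, fn: Callable[[], str]) -> None:
        self.tools[name] = fn

    def call(self, names: List[str]) -> List[str]:
        # Only the requested tools run, so unneeded information is never fetched.
        return [self.tools[name]() for name in names]


@dataclass
class CognitiveMemory:
    common_sense: List[str] = field(default_factory=list)
    experiences: List[Dict[str, str]] = field(default_factory=list)

    def retrieve(self, observations: List[str]) -> Optional[Dict[str, str]]:
        # Toy retrieval: pick the past scene sharing the most words with the query.
        query = set(" ".join(observations).lower().split())
        best, best_score = None, 0
        for exp in self.experiences:
            score = len(query & set(exp["scene"].lower().split()))
            if score > best_score:
                best, best_score = exp, score
        return best


def drive_step(tools: ToolLibrary, memory: CognitiveMemory,
               tool_names: List[str]) -> Dict[str, object]:
    observations = tools.call(tool_names)
    experience = memory.retrieve(observations)

    # Prompt an LLM would receive; here it is only assembled, never sent.
    prompt = "\n".join(observations + memory.common_sense)

    # Stand-in for LLM reasoning: reuse the action of the most similar experience.
    action = experience["action"] if experience else "keep_speed"

    # Motion planning, reduced to a short list of target speeds (m/s).
    speeds = {"slow_down": [8.0, 6.0, 4.0],
              "keep_speed": [10.0, 10.0, 10.0]}[action]

    # Self-reflection, reduced to a rule check that clamps unsafe speeds.
    plan = [min(v, SPEED_LIMIT) for v in speeds]
    return {"prompt": prompt, "action": action, "speeds_mps": plan}


tools = ToolLibrary()
tools.register("detect_objects", lambda: "ball rolling onto road ahead")
tools.register("get_ego_state", lambda: "ego speed 10 m/s")
memory = CognitiveMemory(
    common_sense=["A ball on the road may be followed by a child."],
    experiences=[
        {"scene": "ball on road near playground", "action": "slow_down"},
        {"scene": "empty highway", "action": "keep_speed"},
    ],
)
result = drive_step(tools, memory, ["detect_objects", "get_ego_state"])
print(result["action"], result["speeds_mps"])
```

In this toy setup the ball observation matches the stored playground experience, so the step chooses to slow down, mirroring the ball-on-the-road example above. A real system would replace the lookup with LLM calls and the clamp with a proper collision check.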

A Language Agent for Autonomous Driving

Drive Like a Human: Rethinking Autonomous Driving with Large Language Models

Receive, Reason, and React: Drive as You Say, With Large Language Models in Autonomous Vehicles

AgentsCoDriver: Large Language Model Empowered Collaborative Driving with Lifelong Learning

DriveLLM: Charting the Path Toward Full Autonomous Driving with Large Language Models

SurrealDriver: Designing LLM-powered Generative Driver Agent Framework based on Human Drivers' Driving-thinking Data

LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving

Drive as You Speak: Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles

Large Language Models for Autonomous Driving (LLM4AD): Concept, Benchmark, Simulation, and Real-Vehicle Experiment

SurrealDriver: Designing Generative Driver Agent Simulation Framework in Urban Contexts based on Large Language Model

Personalized Autonomous Driving with Large Language Models: Field Experiments

DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models

Empowering Autonomous Driving with Large Language Models: A Safety Perspective

LLM4Drive: A Survey of Large Language Models for Autonomous Driving

OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning

DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving

Driving Everywhere with Large Language Model Policy Adaptation

Facilitating Autonomous Driving Tasks with Large Language Models

Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving

ADriver-I: A General World Model for Autonomous Driving