Abstract: Artificial intelligence (AI) plays a crucial role in autonomous driving (AD) research, propelling its development towards intelligence and efficiency. Currently, the development of AD technology follows two main technical paths: modularization and end-to-end. Modularization decomposes the driving task into modules such as perception, prediction, planning, and control, and trains them separately. Because the training objectives of the modules are inconsistent, the integrated system suffers from bias. End-to-end attempts to address this issue by using a single model that maps directly from sensor data to control signals. This path has limited capacity to learn a comprehensive set of features and struggles to handle unpredictable long-tail events and complex urban traffic scenarios. In light of the challenges faced by both paths, many researchers believe that large language models (LLMs), with their powerful reasoning capabilities and extensive knowledge, may be the solution, expecting LLMs to provide AD systems with deeper levels of understanding and decision-making capabilities. To understand whether LLMs could enhance AD, this paper conducts a thorough analysis of the potential applications of LLMs in AD systems, including exploring their optimization strategies in both modular and end-to-end approaches, with a particular focus on how LLMs can tackle the problems and challenges present in current solutions. Furthermore, we discuss an important question: Can LLM-based artificial general intelligence (AGI) be a key to achieving high-level AD? We also analyze the potential limitations and challenges that LLMs may encounter in promoting the development of AD technology.

Vision Language Models in Autonomous Driving and Intelligent Transportation Systems

Vision Language Models in Autonomous Driving: A Survey and Outlook

A Survey on Multimodal Large Language Models for Autonomous Driving

LLM4Drive: A Survey of Large Language Models for Autonomous Driving

Large Language Models for Autonomous Driving (LLM4AD): Concept, Benchmark, Simulation, and Real-Vehicle Experiment

An Introduction to Vision-Language Modeling

Exploring the Frontier of Vision-Language Models: A Survey of Current Methodologies and Future Directions

XLM for Autonomous Driving Systems: A Comprehensive Review

Large Language Models for Human-like Autonomous Driving: A Survey

V2X-VLM: End-to-End V2X Cooperative Autonomous Driving Through Large Vision-Language Models

Vision-Language Models for Vision Tasks: A Survey

Vision-Language Intelligence: Tasks, Representation Learning, and Large Models

SimpleLLM4AD: An End-to-End Vision-Language Model with Graph Visual Question Answering for Autonomous Driving

Empowering Autonomous Driving with Large Language Models: A Safety Perspective

A Survey on Large Language Model-empowered Autonomous Driving

Enabling Vision-and-Language Navigation for Intelligent Connected Vehicles Using Large Pre-Trained Models

DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models

MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding

Evaluation and Comparison of Visual Language Models for Transportation Engineering Problems

Semantic Understanding of Traffic Scenes with Large Vision Language Models