LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs

Yunsheng Ma, Can Cui, Xu Cao, Wenqian Ye, Peiran Liu, Juanwu Lu, Amr Abdelraouf, Rohit Gupta, Kyungtae Han, Aniket Bera, James M. Rehg, Ziran Wang
2024-04-04
Abstract: Autonomous driving (AD) has made significant strides in recent years. However, existing frameworks struggle to interpret and execute spontaneous user instructions, such as "overtake the car ahead." Large Language Models (LLMs) have demonstrated impressive reasoning capabilities, showing potential to bridge this gap. In this paper, we present LaMPilot, a novel framework that integrates LLMs into AD systems, enabling them to follow user instructions by generating code that leverages established functional primitives. We also introduce LaMPilot-Bench, the first benchmark dataset specifically designed to quantitatively evaluate the efficacy of language model programs in AD. Adopting the LaMPilot framework, we conduct extensive experiments to assess the performance of off-the-shelf LLMs on LaMPilot-Bench. Our results demonstrate the potential of LLMs in handling diverse driving scenarios and following user instructions in driving. To facilitate further research in this area, we release our code and data at https://github.com/PurdueDigitalTwin/LaMPilot.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses how autonomous driving systems can understand and execute user instructions given in natural language. Current autonomous driving frameworks have difficulty handling such unstructured instructions. The paper proposes a framework called LaMPilot, which uses large language models (LLMs) to generate code that turns a natural-language instruction into an executable driving plan built from existing functional primitives. It also introduces LaMPilot-Bench, a benchmark dataset designed to quantitatively evaluate LLMs on autonomous driving tasks. The dataset consists of tasks described in natural language together with a simulated environment in which the resulting agent policies can be evaluated. Using the LaMPilot framework, the authors ran extensive experiments with off-the-shelf LLMs; the results suggest that LLMs have the potential to handle a variety of driving scenarios and to follow driving instructions. The authors also propose a baseline method based on human feedback, which integrates human guidance into the LLM's decision-making process to improve its performance. Overall, the paper aims to make autonomous driving systems better at understanding and responding to natural-language instructions, and integrating LLMs with traditional autonomous driving algorithms is intended to improve the system's flexibility and interpretability.
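The idea of a language model program built from functional primitives can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `ToyEgoVehicle` class, the primitives `change_lane` and `set_target_speed`, and the hard-coded `GENERATED_PROGRAM` string (a stand-in for what an LLM might return for "overtake the car ahead") are hypothetical and are not the actual LaMPilot API or benchmark code.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEgoVehicle:
    """Hypothetical ego-vehicle state; not the LaMPilot API."""
    lane: int = 1                 # lane index, 0 is the leftmost lane
    speed: float = 20.0           # target speed in m/s
    log: list = field(default_factory=list)

    # Hypothetical functional primitives exposed to the generated program.
    def change_lane(self, direction: str) -> None:
        if direction not in ("left", "right"):
            raise ValueError(f"unknown direction: {direction}")
        self.lane += -1 if direction == "left" else 1
        self.log.append(f"change_lane({direction})")

    def set_target_speed(self, speed: float) -> None:
        self.speed = float(speed)
        self.log.append(f"set_target_speed({speed})")


# Stand-in for LLM output given the instruction "overtake the car ahead".
# A real system would obtain this text from a model call.
GENERATED_PROGRAM = """
set_target_speed(25.0)
change_lane("left")
change_lane("right")
"""


def run_program(program: str, ego: ToyEgoVehicle) -> ToyEgoVehicle:
    """Execute generated code with only the primitives in scope.

    Emptying __builtins__ narrows what the program can name, but it is
    not a security sandbox; it only keeps this toy example tidy.
    """
    allowed = {
        "__builtins__": {},
        "change_lane": ego.change_lane,
        "set_target_speed": ego.set_target_speed,
    }
    exec(program, allowed)
    return ego


ego = run_program(GENERATED_PROGRAM, ToyEgoVehicle())
print(ego.lane, ego.speed, ego.log)
```

In this toy setup the generated program can only act through the listed primitives, which is one plausible reading of why code generation over established primitives may help interpretability: the resulting plan is a short, inspectable sequence of calls rather than an opaque control output.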