Multimodal Large Language Model Driven Scenario Testing for Autonomous Vehicles

Qiujing Lu, Xuanhan Wang, Yiwei Jiang, Guangming Zhao, Mingyue Ma, Shuo Feng
2024-09-10
Abstract: The generation of corner cases has become increasingly crucial for efficiently testing autonomous vehicles (AVs) prior to road deployment. However, existing methods struggle to accommodate diverse testing requirements and often fail to generalize to unseen situations, which reduces the convenience and usability of the generated scenarios. A method that enables easily controllable scenario generation for efficient AV testing with realistic and challenging situations is greatly needed. To address this, we propose OmniTester, a multimodal Large Language Model (LLM) based framework that leverages the extensive world knowledge and reasoning capabilities of LLMs. OmniTester is designed to generate realistic and diverse scenarios within a simulation environment, offering a robust solution for testing and evaluating AVs. In addition to prompt engineering, we employ tools from Simulation of Urban Mobility (SUMO) to reduce the complexity of the code generated by LLMs. Furthermore, we incorporate Retrieval-Augmented Generation (RAG) and a self-improvement mechanism to enhance the LLM's understanding of scenarios, thereby increasing its ability to produce realistic scenes. In the experiments, we demonstrate the controllability and realism of our approach in generating three types of challenging and complex scenarios. Additionally, we showcase its effectiveness in reconstructing new scenarios described in crash reports, driven by the generalization capability of LLMs.
Robotics, Artificial Intelligence, Emerging Technologies
What problem does this paper attempt to address?
The paper addresses a gap in autonomous vehicle (AV) testing: existing scenario generation methods struggle to meet diverse testing requirements and generalize poorly to unseen situations, which limits the convenience and usability of the scenarios they produce. A method is therefore needed that generates controllable, realistic, and challenging test scenarios so that AVs can be tested efficiently before road deployment. To this end, the paper proposes OmniTester, a multimodal large language model (LLM) based framework that draws on the extensive world knowledge and reasoning capabilities of LLMs to generate realistic and diverse test scenarios in a simulation environment. Besides prompt engineering, OmniTester uses tools from Simulation of Urban Mobility (SUMO) to simplify the code the LLM has to generate, and it combines retrieval-augmented generation (RAG) with a self-improvement mechanism to improve the LLM's understanding of scenarios and the realism of its output. In experiments, the paper demonstrates the controllability and realism of OmniTester on three types of challenging and complex scenarios and shows that it can reconstruct new scenarios described in crash reports.
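The combination of retrieval-augmented generation and a self-improvement mechanism described above can be pictured as a retrieve, generate, validate, and retry loop. The following is a minimal sketch of that general pattern, not the paper's implementation: the knowledge base, the word-overlap retriever, the validation rules, the scenario fields, and the stub standing in for the LLM are all hypothetical, and no real LLM or SUMO call is made.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical in-memory knowledge base standing in for a retrieval corpus.
KNOWLEDGE = [
    "cut-in: a vehicle merges into the ego lane with a small gap",
    "car following: the lead vehicle brakes hard in the same lane",
    "roundabout: vehicles yield on entry and circulate in one direction",
]


def retrieve(query: str, docs: List[str], k: int = 1) -> List[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    words = set(query.lower().split())

    def overlap(doc: str) -> int:
        return len(words & set(doc.lower().replace(":", " ").split()))

    return sorted(docs, key=overlap, reverse=True)[:k]


def validate(scenario: Dict) -> List[str]:
    """Return a list of problems; an empty list means the scenario passes."""
    problems = [f"missing field: {key}"
                for key in ("road", "vehicles") if key not in scenario]
    for vehicle in scenario.get("vehicles", []):
        if vehicle.get("speed_mps", 0.0) < 0:
            problems.append(f"negative speed for vehicle {vehicle.get('id')}")
    return problems


def generate_scenario(
    request: str,
    llm: Callable[[str, List[str], List[str]], Dict],
    max_rounds: int = 3,
) -> Tuple[Dict, int]:
    """Retrieve context, generate, and retry with validator feedback."""
    context = retrieve(request, KNOWLEDGE)
    feedback: List[str] = []
    for round_idx in range(1, max_rounds + 1):
        scenario = llm(request, context, feedback)
        feedback = validate(scenario)
        if not feedback:
            return scenario, round_idx
    raise ValueError(f"no valid scenario after {max_rounds} rounds: {feedback}")


def stub_llm(request: str, context: List[str], feedback: List[str]) -> Dict:
    """Stand-in for an LLM: omits the vehicles until it receives feedback."""
    scenario: Dict = {"road": "two-lane highway", "notes": context}
    if feedback:
        scenario["vehicles"] = [
            {"id": "ego", "speed_mps": 25.0},
            {"id": "cutter", "speed_mps": 22.0},
        ]
    return scenario


if __name__ == "__main__":
    result, rounds = generate_scenario("generate a cut-in scenario", stub_llm)
    print(rounds, result["vehicles"])
```

In this sketch the validator's messages are passed back to the generator on the next round, which is one simple way to realize a self-improvement step; in a real pipeline the feedback might instead come from running the generated scenario in the simulator.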