Abstract: Autonomous vehicle (AV) systems must be comprehensively tested and evaluated before they can be deployed. High-fidelity simulators such as CARLA or LGSVL allow this to be done safely in very realistic and highly customizable environments. Existing testing approaches, however, fail to test simulated AVs systematically, as they focus on specific scenarios and oracles (e.g., lane following scenario with the “no collision” requirement) and lack any coverage criteria measures. In this paper, we propose AVUnit, a framework for systematically testing AV systems against customizable correctness specifications. Designed modularly to support different simulators, AVUnit consists of two new languages for specifying dynamic properties of scenes (e.g., changing pedestrian behaviour after waypoints) and fine-grained assertions about the AV's journey. AVUnit further supports multiple fuzzing algorithms that automatically search for test cases that violate these assertions, using robustness and coverage measures as fitness metrics. We evaluated the implementation of AVUnit for the LGSVL+Apollo simulation environment, finding 19 kinds of issues in Apollo, which indicate that the open-source Apollo does not perform well in complex intersections and lane-changing related scenarios.
