Abstract: Realistic and interactive scene simulation is a key prerequisite for autonomous vehicle (AV) development. In this work, we present SceneDiffuser, a scene-level diffusion prior designed for traffic simulation. It offers a unified framework that addresses two key stages of simulation: scene initialization, which involves generating initial traffic layouts, and scene rollout, which encompasses the closed-loop simulation of agent behaviors. While diffusion models have been proven effective in learning realistic and multimodal agent distributions, several challenges remain, including controllability, maintaining realism in closed-loop simulations, and ensuring inference efficiency. To address these issues, we introduce amortized diffusion for simulation. This novel diffusion denoising paradigm amortizes the computational cost of denoising over future simulation steps, significantly reducing the cost per rollout step (16x fewer inference steps) while also mitigating closed-loop errors. We further enhance controllability through the introduction of generalized hard constraints, a simple yet effective inference-time constraint mechanism, as well as language-based constrained scene generation via few-shot prompting of a large language model (LLM). Our investigations into model scaling reveal that increased computational resources significantly improve overall simulation realism. We demonstrate the effectiveness of our approach on the Waymo Open Sim Agents Challenge, achieving top open-loop performance and the best closed-loop performance among diffusion models.

What problem does this paper attempt to address?

The problem this paper addresses is how to build realistic, interactive scene simulation for autonomous vehicle (AV) development. It proposes SceneDiffuser, a scene-level diffusion model that handles two key stages of traffic simulation in one framework: scene initialization and scene rollout.

### 1. Scene Initialization

- **Problem**: Generate the initial traffic layout.
- **Challenge**: The generated initial scenes must be diverse and realistic, and they must support controllable editing.

### 2. Scene Rollout

- **Problem**: Simulate agent behaviors in closed-loop simulation.
- **Challenge**:
  - Maintain realism in closed-loop simulation.
  - Keep inference efficient by reducing the cost of each rollout step.
  - Avoid distribution drift caused by accumulated errors.

To address these problems, the paper introduces the following innovations:

#### 1. Amortized Diffusion

Amortized diffusion is a new diffusion denoising paradigm. It spreads the computational cost of denoising over future simulation steps, which reduces the number of inference steps per rollout step by 16x and also mitigates closed-loop errors.

#### 2. Controllability Enhancement

- **Generalized Hard Constraints (GHC)**: A simple and effective inference-time constraint mechanism that ensures the generated scenes meet specified conditions.
- **Language-based Constraint Generation**: Few-shot prompting of a large language model (LLM) converts natural-language descriptions into scene-generation constraints.

#### 3. Model Scaling

The paper's scaling study finds that increasing computational resources significantly improves overall simulation realism.
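
The amortized diffusion idea described above can be illustrated with a toy rolling buffer. This is a minimal sketch under stated assumptions, not the paper's implementation: `CountingDenoiser` is a hypothetical stand-in for a learned model, `NUM_LEVELS = 16` is chosen only to mirror the 16x figure, and the buffer layout (nearer future steps kept at lower noise levels) is an assumed reading of "amortizing denoising over future simulation steps". It only demonstrates the cost accounting: one denoiser call per simulation step instead of one full denoising chain per step.

```python
import numpy as np

NUM_LEVELS = 16  # assumed number of denoising levels (mirrors the 16x figure)


class CountingDenoiser:
    """Hypothetical stand-in for a learned denoiser; counts its own calls."""

    def __init__(self):
        self.calls = 0

    def __call__(self, x, levels):
        # x: (horizon, dim) noisy future states.
        # levels: (horizon,) integer noise levels, each >= 1.
        # Toy rule: shrink toward the "clean" value 0, removing one noise level.
        self.calls += 1
        scale = (levels - 1) / levels
        return x * scale[:, None]


def full_denoising_rollout(denoiser, n_steps, dim, rng):
    """Baseline: every simulation step runs a full chain of NUM_LEVELS calls."""
    states = []
    for _ in range(n_steps):
        x = rng.standard_normal((1, dim))
        for level in range(NUM_LEVELS, 0, -1):
            x = denoiser(x, np.array([level]))
        states.append(x[0])
    return np.stack(states)


def amortized_rollout(denoiser, n_steps, dim, rng):
    """Rolling buffer: a single denoiser call per simulation step."""
    # Future step i sits at noise level i + 1, so nearer steps are cleaner.
    levels = np.arange(1, NUM_LEVELS + 1)
    buffer = rng.standard_normal((NUM_LEVELS, dim)) * (levels / NUM_LEVELS)[:, None]
    states = []
    for _ in range(n_steps):
        buffer = denoiser(buffer, levels)       # every entry drops one level
        states.append(buffer[0])                # entry 0 is now clean: execute it
        fresh = rng.standard_normal((1, dim))   # new farthest step, pure noise
        buffer = np.concatenate([buffer[1:], fresh], axis=0)
    return np.stack(states)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    baseline, amortized = CountingDenoiser(), CountingDenoiser()
    full_denoising_rollout(baseline, n_steps=10, dim=4, rng=rng)
    amortized_rollout(amortized, n_steps=10, dim=4, rng=rng)
    print("baseline calls:", baseline.calls, "amortized calls:", amortized.calls)
```

A real closed-loop simulator would also condition each call on the latest observed scene state, which this toy omits, so it says nothing about the claimed reduction in closed-loop error.
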
### Experimental Results

The paper demonstrates the effectiveness of its method on the Waymo Open Sim Agents Challenge, achieving top open-loop performance and the best closed-loop performance among diffusion models.

In summary, SceneDiffuser handles both scene initialization and scene rollout in a unified framework, improving simulation realism and efficiency while offering better controllability.
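
The inference-time constraint mechanism mentioned in the summary can be illustrated with a common pattern: projecting the sample onto the feasible set after every denoising step. Whether this matches the paper's generalized hard constraints exactly is an assumption; the sketch below uses box constraints via `np.clip`, and `toy_denoise_step`, the feature layout `[x, y, speed]`, and all numeric values are hypothetical.

```python
import numpy as np


def toy_denoise_step(x, level, target):
    """Hypothetical denoiser: moves x toward `target`, reaching it at level 1."""
    return x + (target - x) / level


def project(x, lower, upper):
    """Project onto the box [lower, upper]; equal bounds pin a value exactly."""
    return np.clip(x, lower, upper)


def constrained_sampling(x, n_levels, target, lower, upper):
    """Alternate one denoising step with one projection onto the constraints."""
    for level in range(n_levels, 0, -1):
        x = toy_denoise_step(x, level, target)
        x = project(x, lower, upper)  # constraint enforced at inference time only
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.standard_normal(3)             # assumed features: [x, y, speed]
    target = np.array([5.0, 5.0, 5.0])      # where the toy model wants to go
    lower = np.array([-np.inf, 2.0, 0.0])   # x free, y pinned to 2, speed >= 0
    upper = np.array([np.inf, 2.0, 3.0])    # speed <= 3
    print(constrained_sampling(x0, 16, target, lower, upper))
```

Because the constraint is applied only during sampling, the same trained model can be reused with different bounds without retraining, which is the appeal of inference-time mechanisms in general.
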

SceneDiffuser: Efficient and Controllable Driving Simulation Initialization and Rollout
