Abstract: Accurately simulating diverse behaviors of heterogeneous agents in various scenarios is fundamental to autonomous driving simulation. This task is challenging due to the multi-modality of the behavior distribution, the high dimensionality of driving scenarios, distribution shift, and incomplete information. Our first insight is to leverage state-matching through differentiable simulation to provide meaningful learning signals and achieve efficient credit assignment for the policy. This is demonstrated by revealing the existence of gradient highways and inter-agent gradient pathways. However, this approach also exposes two issues: gradient explosion and weak supervision in low-density regions. Our second insight is that these issues can be addressed by applying dual policy regularizations to narrow the function space. Further considering diversity, our third insight is that the behaviors of heterogeneous agents in the dataset can be effectively compressed as a series of prototype vectors for retrieval. These lead to our model-based reinforcement-imitation learning framework with temporally abstracted mixture-of-codebooks (MRIC). MRIC introduces open-loop model-based imitation learning regularization to stabilize training, and model-based reinforcement learning (RL) regularization to inject domain knowledge. The RL regularization involves differentiable Minkowski-difference-based collision avoidance and projection-based on-road and traffic rule compliance rewards. A dynamic multiplier mechanism is further proposed to eliminate the interference from the regularizations while ensuring their effectiveness. Experimental results using the large-scale Waymo Open Motion Dataset show that MRIC outperforms state-of-the-art baselines on diversity, behavioral realism, and distributional realism, with large margins on some key metrics (e.g., collision rate, minSADE, and time-to-collision JSD).
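To illustrate the geometric idea behind a Minkowski-difference-based collision test, the following is a minimal sketch and not the paper's implementation. It assumes two convex 2D agent footprints given as vertex lists, and it returns only a boolean overlap flag, whereas the reward described in the abstract is differentiable. The function names (`convex_hull`, `minkowski_difference`, `polygons_overlap`) are hypothetical. The test relies on the standard fact that two convex shapes overlap exactly when the origin lies inside their Minkowski difference.

```python
from itertools import product


def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def minkowski_difference(poly_a, poly_b):
    """Hull of all pairwise vertex differences a - b (valid for convex inputs)."""
    diffs = [(ax - bx, ay - by) for (ax, ay), (bx, by) in product(poly_a, poly_b)]
    return convex_hull(diffs)


def polygons_overlap(poly_a, poly_b):
    """True if the origin lies inside (or on the boundary of) the Minkowski difference.

    Assumes both inputs are non-degenerate convex polygons (at least three
    non-collinear vertices), so the resulting hull has at least three vertices.
    """
    hull = minkowski_difference(poly_a, poly_b)
    origin = (0.0, 0.0)
    n = len(hull)
    # For a counter-clockwise hull, an interior point is to the left of every edge.
    return all(cross(hull[i], hull[(i + 1) % n], origin) >= 0 for i in range(n))


# Hypothetical axis-aligned vehicle footprints used only for illustration.
car_a = [(0, 0), (2, 0), (2, 2), (0, 2)]
car_b = [(1, 1), (3, 1), (3, 3), (1, 3)]
car_c = [(5, 5), (6, 5), (6, 6), (5, 6)]
overlap_ab = polygons_overlap(car_a, car_b)
overlap_ac = polygons_overlap(car_a, car_c)
```

A differentiable variant would presumably replace the boolean check with a signed distance from the origin to the boundary of the Minkowski difference, so that gradients can flow back to agent poses; that step is an assumption here and is not shown.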

Enhance Generality by Model-based Reinforcement Learning and Domain Randomization

Learning-Based Hierarchical Model Predictive Control for Drift Vehicles

Incorporating Recurrent Reinforcement Learning into Model Predictive Control for Adaptive Control in Autonomous Driving

Driving Reinforcement Learning with Models

Learning Residual Model of Model Predictive Control via Random Forests for Autonomous Driving

VLM-MPC: Vision Language Foundation Model (VLM)-Guided Model Predictive Controller (MPC) for Autonomous Driving

Design and Implementation of Reinforcement Learning for Automated Driving Compared to Classical MPC Control

Uncertainty-aware hybrid paradigm of nonlinear MPC and model-based RL for offroad navigation: Exploration of transformers in the predictive model

Pay Attention to How You Drive: Safe and Adaptive Model-Based Reinforcement Learning for Off-Road Driving

MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning

Efficient and Generalized End-to-end Autonomous Driving System with Latent Deep Reinforcement Learning and Demonstrations

Experimental Validation of Safe MPC for Autonomous Driving in Uncertain Environments

DR-MPC: Deep Residual Model Predictive Control for Real-world Social Navigation

Towards Robust Decision-Making for Autonomous Highway Driving Based on Safe Reinforcement Learning

MRIC: Model-Based Reinforcement-Imitation Learning with Mixture-of-Codebooks for Autonomous Driving Simulation

Interaction-aware Model Predictive Control for Autonomous Driving

Robust Driving Policy Learning with Guided Meta Reinforcement Learning

Research on Autonomous Driving Decision-making Strategies based Deep Reinforcement Learning

Model-free Deep Reinforcement Learning for Urban Autonomous Driving

A Combined Reinforcement Learning and Model Predictive Control for Car-Following Maneuver of Autonomous Vehicles

Improving safety in mixed traffic: A learning-based model predictive control for autonomous and human-driven vehicle platooning