Abstract: Humans have a remarkable ability to make decisions by accurately reasoning about future events, including the future behaviors and states of mind of other agents. Consider driving a car through a busy intersection: it is necessary to reason about the physics of the vehicle, the intentions of other drivers, and their beliefs about your own intentions. If you signal a turn, another driver might yield to you, or if you enter the passing lane, another driver might decelerate to give you room to merge in front. Competent drivers must plan how they can safely react to a variety of potential future behaviors of other agents before they make their next move. This requires contingency planning: explicitly planning a set of conditional actions that depend on the stochastic outcome of future events. In this work, we develop a general-purpose contingency planner that is learned end-to-end using high-dimensional scene observations and low-dimensional behavioral observations. We use a conditional autoregressive flow model to create a compact contingency planning space, and show how this model can tractably learn contingencies from behavioral observations. We developed a closed-loop control benchmark of realistic multi-agent scenarios in a driving simulator (CARLA), on which we compare our method to various noncontingent methods that reason about multi-agent future behavior, including several state-of-the-art deep learning-based planning approaches. We illustrate that these noncontingent planning methods fundamentally fail on this benchmark, and find that our deep contingency planning method achieves significantly superior performance. Code to run our benchmark and reproduce our results is available at https://sites.google.com/view/contingency-planning.

What problem does this paper attempt to address?

This paper addresses decision-making and planning in multi-agent environments, with a focus on self-driving cars in complex traffic. Specifically, it asks how a self-driving system can make safe and efficient decisions by reasoning accurately about future events, including the behaviors and states of mind of other drivers, as human drivers do. For example, when driving through a busy intersection, the system must reason about the physics of its own vehicle, the intentions of other drivers, and those drivers' beliefs about its own intentions. If it signals a turn, another driver may yield; if it enters the passing lane, another driver may decelerate to leave it room to merge in front. The paper proposes a new method, Contingencies from Observations, a general-purpose contingency planner learned end-to-end from high-dimensional scene observations and low-dimensional behavioral observations. The method uses a conditional autoregressive flow model to create a compact contingency planning space and shows how this model can tractably learn contingencies from behavioral observations. Unlike traditional model predictive control (MPC), a contingency plan is not just a sequence of future actions; it is a policy whose action at each future time step depends on what has been observed up to that step. The resulting plan adapts to the future behaviors of other agents, which supports safer and more efficient driving in multi-agent settings. The paper also introduces a closed-loop control benchmark of realistic multi-agent scenarios in a driving simulator (CARLA). The experiments show that noncontingent planning methods fail fundamentally on this benchmark, while the proposed deep contingency planning method achieves significantly better performance.
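The difference between a fixed action sequence and a contingency plan can be illustrated with a toy two-step version of the intersection example. The sketch below is not the paper's method: the yield probabilities, the reward values, and all names (`P_YIELD`, `reward`, `best_open_loop`, `best_contingent`) are hypothetical, and the other driver's behavior is hard-coded rather than learned from observations. It only shows why a plan that conditions on the observed outcome can beat the best plan that cannot.

```python
from itertools import product

ACTIONS_T0 = ("signal", "wait")    # ego action before the other driver reacts
ACTIONS_T1 = ("turn", "stop")      # ego action after the other driver reacts
OUTCOMES = ("yields", "proceeds")  # other driver's observed behavior

# Hypothetical behavior model: probability that the other driver yields,
# given the ego action at step 0. In the paper this would be learned.
P_YIELD = {"signal": 0.8, "wait": 0.1}


def outcome_probs(a0):
    """Distribution over the other driver's behavior given ego action a0."""
    p = P_YIELD[a0]
    return {"yields": p, "proceeds": 1.0 - p}


def reward(a1, outcome):
    """Hypothetical reward: a completed turn is good, a collision is very bad."""
    if a1 == "stop":
        return 0.0
    return 1.0 if outcome == "yields" else -10.0


def best_open_loop():
    """Best fixed sequence (a0, a1); a1 cannot depend on the observed outcome."""
    best = None
    for a0, a1 in product(ACTIONS_T0, ACTIONS_T1):
        value = sum(p * reward(a1, o) for o, p in outcome_probs(a0).items())
        if best is None or value > best[0]:
            best = (value, (a0, a1))
    return best


def best_contingent():
    """Best plan (a0, policy), where policy maps each observed outcome to a1."""
    best = None
    for a0 in ACTIONS_T0:
        probs = outcome_probs(a0)
        # For each possible observation, pick the action that is best for it.
        policy = {o: max(ACTIONS_T1, key=lambda a1: reward(a1, o)) for o in OUTCOMES}
        value = sum(probs[o] * reward(policy[o], o) for o in OUTCOMES)
        if best is None or value > best[0]:
            best = (value, (a0, policy))
    return best


if __name__ == "__main__":
    print("open-loop :", best_open_loop())
    print("contingent:", best_contingent())
```

Under these assumed numbers, the best fixed sequence is to signal and then stop, because committing to the turn risks a collision. The contingent plan signals, then turns only if the other driver yields, and so obtains a higher expected reward.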

Contingencies from Observations: Tractable Contingency Planning with Learned Behavior Models
