Abstract: Effective decision-making in autonomous driving relies on accurate inference of other traffic agents' future behaviors. To achieve this, we propose an online belief-update-based behavior prediction model and an efficient planner for Partially Observable Markov Decision Processes (POMDPs). We develop a Transformer-based prediction model, enhanced with a recurrent neural memory model, to dynamically update the latent belief state and infer the intentions of other agents. The model can also integrate the ego vehicle's intentions to reflect closed-loop interactions among agents, and it learns from both offline data and online interactions. For planning, we employ a Monte-Carlo Tree Search (MCTS) planner with macro actions, which reduces computational complexity by searching over temporally extended action steps. Inside the MCTS planner, we use predicted long-term multi-modal trajectories to approximate future belief updates, which eliminates iterative belief updating and improves runtime efficiency. Our approach also incorporates deep Q-learning (DQN) as a search prior, which significantly improves the performance of the MCTS planner. Experimental results from simulated environments validate the effectiveness of the proposed method. The online belief update model significantly enhances the accuracy and temporal consistency of predictions, leading to improved decision-making performance. Employing DQN as a search prior in the MCTS planner considerably boosts its performance and outperforms an imitation learning-based prior. Additionally, we show that MCTS planning with macro actions substantially outperforms the vanilla method in both performance and efficiency.
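To make the planner description concrete, the following is a minimal sketch of how a learned Q-value prior and macro actions could enter MCTS action selection. It is not the authors' implementation: the abstract does not say how the DQN prior is combined with the search, so the AlphaZero-style PUCT rule, the softmax over Q-values, and the idea of a macro action as one primitive action repeated for several steps are all assumptions. The names `softmax`, `puct_select`, `expand_macro_action`, and the constant `c_puct` are hypothetical.

```python
import math


def softmax(values, temperature=1.0):
    """Turn Q-value estimates into prior probabilities (assumed mapping)."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def puct_select(q_prior, visit_counts, mean_values, c_puct=1.5):
    """Pick the child action with the highest PUCT score.

    q_prior      -- Q-values from a learned network, used only as a prior
    visit_counts -- number of times each child has been visited
    mean_values  -- average return observed so far for each child
    """
    priors = softmax(q_prior)
    total_visits = sum(visit_counts)
    best_action, best_score = 0, -math.inf
    for action, (p, n, v) in enumerate(zip(priors, visit_counts, mean_values)):
        # Exploitation term plus a prior-weighted exploration bonus that
        # shrinks as the child accumulates visits.
        score = v + c_puct * p * math.sqrt(total_visits + 1) / (1 + n)
        if score > best_score:
            best_action, best_score = action, score
    return best_action


def expand_macro_action(primitive, repeat):
    """Assumed macro action: one primitive action held for `repeat` steps."""
    return [primitive] * repeat


if __name__ == "__main__":
    # With no visits yet, the prior alone decides which child is tried first.
    print(puct_select([0.0, 2.0, 1.0], [0, 0, 0], [0.0, 0.0, 0.0]))
    print(expand_macro_action("keep_lane", 3))
```

Under these assumptions, each tree edge covers `repeat` simulation steps rather than one, so a fixed planning horizon is reached with a shallower tree, which is the complexity reduction the abstract attributes to macro actions.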

Influence-Augmented Online Planning for Complex Environments

Factored Online Planning in Many-Agent POMDPs

Scalable Planning and Learning for Multiagent POMDPs: Extended Version

Interactive POMDP Lite: Towards Practical Planning to Predict and Exploit Intentions for Interacting with Self-Interested Agents

Safe POMDP Online Planning among Dynamic Agents via Adaptive Conformal Prediction

Online Planning in POMDPs with State-Requests

Online Replanning in Belief Space for Partially Observable Task and Motion Problems

Game-theoretic Objective Space Planning

CAR-DESPOT: Causally-Informed Online POMDP Planning for Robots in Confounded Environments

Learning Online Belief Prediction for Efficient POMDP Planning in Autonomous Driving

Online POMDP Planning via Simplification

Intention-Aware Navigation in Crowds with Extended-Space POMDP Planning

Leveraging Statistical Multi-Agent Online Planning with Emergent Value Function Approximation

Hybrid Heuristic Online Planning for POMDPs

FHHOP: a Factored Hybrid Heuristic Online Planning Algorithm for Large POMDPs

Scalable Decision-Theoretic Planning in Open and Typed Multiagent Systems

Interactive Joint Planning for Autonomous Vehicles

Scaling Long-Horizon Online POMDP Planning via Rapid State Space Sampling

ReasonPlanner: Enhancing Autonomous Planning in Dynamic Environments with Temporal Knowledge Graphs and LLMs

Describe, Explain, Plan and Select: Interactive Planning with LLMs Enables Open-World Multi-Task Agents

CAMPs: Learning Context-Specific Abstractions for Efficient Planning in Factored MDPs