Abstract: To solve the problem of lateral and longitudinal joint decision-making in multi-vehicle cooperative driving for connected and automated vehicles (CAVs), this paper proposes a Monte Carlo tree search (MCTS) method with parallel update for multi-agent Markov games in a finite-horizon, time-discounted setting. By analyzing the parallel actions in the multi-vehicle joint action space in partial-steady-state traffic flow, the parallel update method can quickly exclude potentially dangerous actions, thereby increasing the search depth without sacrificing the search breadth. The proposed method is tested in a large number of randomly generated traffic flows. The experimental results show that the algorithm is robust and outperforms state-of-the-art reinforcement learning algorithms and heuristic methods. The driving strategy produced by the proposed algorithm shows rationality beyond that of human drivers and offers advantages in traffic efficiency and safety in the coordinating zone.
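
As a reading aid, the finite-horizon, time-discounted objective named in the abstract can be written in the usual form below. The notation is assumed for illustration and is not taken from the paper: $H$ is the planning horizon, $\gamma$ the discount factor, $N$ the number of cooperating vehicles, $\mathbf{a}_t$ the joint action, and $r$ a shared reward.

```latex
J(\pi) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{H-1} \gamma^{t}\, r(s_t, \mathbf{a}_t) \right],
\qquad \mathbf{a}_t = (a_t^{1}, \dots, a_t^{N}),
\qquad 0 < \gamma \le 1 .
```

Under this form, a reward received $t$ steps ahead is weighted by $\gamma^{t}$, and nothing beyond step $H-1$ contributes to the value.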

What problem does this paper attempt to address?

This paper aims to solve the problem of lateral and longitudinal joint decision-making in multi-vehicle cooperative driving for connected and automated vehicles (CAVs). Specifically, it proposes a Monte Carlo tree search (MCTS) method with a parallel update step, designed for multi-agent Markov games with a finite time horizon and time discounting. By analyzing parallel actions in the multi-vehicle joint action space in partial-steady-state traffic flow, the parallel update method can quickly eliminate potentially dangerous actions, thus increasing the search depth without sacrificing the search breadth. The method was tested in a large number of randomly generated traffic flows, and the experimental results show that the algorithm is robust and outperforms existing reinforcement learning algorithms and heuristic methods. The resulting driving strategy shows more rationality than human drivers and can improve traffic efficiency and safety in the coordination zone.

### Main contributions of the paper:

1. **Value-based MCTS method**: The paper proposes a value-based MCTS method for two-dimensional (lateral and longitudinal) joint decision-making in multi-vehicle cooperation. The algorithm adapts well to different environments, handles randomly generated traffic scenarios, and outperforms existing state-of-the-art reinforcement learning algorithms and rule-based methods.
2. **Parallel extension of the standard tree update method**: The paper extends the standard MCTS tree update to a parallel form, improving the search efficiency of the joint strategy in multi-agent systems. For the same number of rollouts, this method increases both the breadth and the depth of the search, and it is suitable for problems with similar steady-state transitions.
3. **Experimental verification**: Experiments were carried out in a large number of randomly generated scenarios, and the cooperative driving behaviors of CAVs were observed. The algorithm shows more rationality than typical human drivers and can optimize traffic conditions over a long time horizon.

### Method overview:

- **Multi-agent Markov game modeling**: Multi-vehicle cooperative driving is modeled as a multi-agent Markov game, with its components defined explicitly: the state space, the joint action space, the state-transition probability distribution, the reward function, and so on.
- **MCTS method**: MCTS consists of four steps: selection, expansion, simulation, and backpropagation. The paper introduces the parallel update method, which identifies parallel actions to accelerate the search.
- **Reward function design**: The reward function aims to improve overall traffic efficiency and safety, and includes speed rewards, intention rewards, collision penalties, and lane-change frequency rewards. For partial-steady-state update systems, the paper proposes a specific reward function design to better capture the differences between actions.

### Experimental setup and results:

- **Simulation environment**: Simulation scenarios are built with the Flow framework and include two CAVs controlled by the MCTS algorithm and four human-driven vehicles (HDVs). Experimental parameters include the initial position, speed, and acceleration of the vehicles.
- **Experimental results**: Across 200 experiments, the algorithm shows good robustness and performance, handles complex traffic scenarios effectively, and improves traffic efficiency and safety.

### Conclusion:

The method proposed in the paper shows significant advantages in multi-vehicle cooperative driving, especially in improving traffic efficiency and safety. With the parallel update technique, the search efficiency of MCTS improves significantly, which supports efficient cooperative driving strategies for CAVs.
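
The four MCTS steps listed in the method overview can be sketched on a toy problem. The sketch below is a minimal illustration with standard sequential backpropagation; it is not the paper's parallel update method, and every detail is an assumption made for the example: the two-vehicle car-following state `(gap, v_follow, v_lead)`, the `step` dynamics, the reward values, and the constants `HORIZON`, `GAMMA`, and `V_MAX`.

```python
import itertools
import math
import random

ACTIONS = (-1, 0, 1)  # hypothetical per-vehicle acceleration choices
JOINT_ACTIONS = list(itertools.product(ACTIONS, repeat=2))  # (follower, leader)
HORIZON, GAMMA, V_MAX = 6, 0.9, 3  # assumed horizon, discount, speed cap


def step(state, joint):
    """Toy transition. Returns (next_state, reward, done)."""
    gap, vf, vl = state
    vf = min(max(vf + joint[0], 0), V_MAX)
    vl = min(max(vl + joint[1], 0), V_MAX)
    gap += vl - vf
    if gap <= 0:
        return (gap, vf, vl), -10.0, True  # collision penalty ends the episode
    return (gap, vf, vl), (vf + vl) / (2 * V_MAX), False  # speed reward


class Node:
    def __init__(self, state, depth, done=False):
        self.state, self.depth, self.done = state, depth, done
        self.children = {}  # joint action -> (reward on entering, child node)
        self.untried = list(JOINT_ACTIONS)
        self.visits, self.value = 0, 0.0  # value = sum of sampled returns


def ucb(node, action, c):
    """UCB1 score of a joint action, using Q = r + gamma * mean child return."""
    r, child = node.children[action]
    q = r + GAMMA * child.value / child.visits
    return q + c * math.sqrt(math.log(node.visits) / child.visits)


def rollout(state, depth, rng):
    """Simulation: random joint actions until the horizon or a collision."""
    total, discount = 0.0, 1.0
    while depth < HORIZON:
        state, r, done = step(state, rng.choice(JOINT_ACTIONS))
        total += discount * r
        discount *= GAMMA
        depth += 1
        if done:
            break
    return total


def search(root_state, n_iter=2000, c=1.4, seed=0):
    rng = random.Random(seed)
    root = Node(root_state, 0)
    for _ in range(n_iter):
        node, path = root, []  # path holds (parent, reward on the edge taken)
        # 1. Selection: descend through fully expanded nodes by UCB1.
        while not node.done and node.depth < HORIZON and not node.untried:
            best = max(node.children, key=lambda a: ucb(node, a, c))
            r, child = node.children[best]
            path.append((node, r))
            node = child
        # 2. Expansion: add one untried joint action.
        if not node.done and node.depth < HORIZON:
            action = node.untried.pop(rng.randrange(len(node.untried)))
            nxt, r, done = step(node.state, action)
            child = Node(nxt, node.depth + 1, done)
            node.children[action] = (r, child)
            path.append((node, r))
            node = child
        # 3. Simulation from the new leaf.
        ret = 0.0 if node.done else rollout(node.state, node.depth, rng)
        # 4. Backpropagation of the discounted return along the path.
        node.visits += 1
        node.value += ret
        for parent, r in reversed(path):
            ret = r + GAMMA * ret
            parent.visits += 1
            parent.value += ret
    # Recommend the most visited joint action at the root.
    return max(root.children, key=lambda a: root.children[a][1].visits)
```

Calling `search((1, 2, 2))` returns a joint action `(a_follow, a_lead)` for a follower one cell behind a leader at equal speed. In this toy model any joint action in which the follower ends up faster than the leader collides immediately, so the search has to coordinate both vehicles rather than maximize each one's speed separately.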

A Value Based Parallel Update MCTS Method for Multi-Agent Cooperative Decision Making of Connected and Automated Vehicles
