Abstract: Monte Carlo tree search (MCTS) has been gaining increasing popularity, and the success of AlphaGo has prompted a new trend of incorporating a value network and a policy network constructed with neural networks into MCTS, namely, NN-MCTS. In this work, motivated by the shortcomings of the widely used upper confidence bounds applied to trees (UCT) policy, we formulate the node selection problem in NN-MCTS as a multistage ranking and selection (R&S) problem and propose a node selection policy that efficiently allocates a limited search budget to maximize the probability of correctly selecting the best action at the root state. The value and policy networks in NN-MCTS further improve the performance of the proposed node selection policy by providing prior knowledge and guiding the selection of the final action, respectively. Numerical experiments on two board games and an OpenAI task demonstrate that the proposed method outperforms the UCT policy used in AlphaGo Zero and MuZero, implying the potential of constructing node selection policies in NN-MCTS with R&S procedures.

History: Accepted by Bruno Tuffin, Area Editor for Simulation.

Funding: This work was supported by the National Natural Science Foundation of China [Grants 72325007, 72250065, and 72022001], and a PKU-Boya Postdoctoral Fellowship 2406396158.

Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information (https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2023.0307) as well as from the IJOC GitHub software repository (https://github.com/INFORMSJoC/2023.0307). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/.
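For readers unfamiliar with the baseline named in the abstract, the sketch below illustrates the PUCT-style variant of UCT commonly associated with AlphaGo Zero: each child is scored by its mean value plus an exploration bonus weighted by the policy-network prior. It shows the baseline only, not the R&S-based policy proposed in the paper. The function name `puct_select`, its argument names, and the default `c_puct` value are assumptions made for illustration and are not taken from the paper or its repository.

```python
import math


def puct_select(priors, visit_counts, mean_values, c_puct=1.25):
    """Return the index of the child with the largest Q + U score.

    priors       -- policy-network prior probability of each child
    visit_counts -- number of visits to each child so far
    mean_values  -- average backed-up value (Q) of each child
    c_puct       -- exploration weight (illustrative default, an assumption)
    """
    total_visits = sum(visit_counts)
    best_index, best_score = 0, -math.inf
    for i, (p, n, q) in enumerate(zip(priors, visit_counts, mean_values)):
        # Exploration bonus: large for children with a high prior and few visits.
        u = c_puct * p * math.sqrt(total_visits) / (1 + n)
        score = q + u
        if score > best_score:
            best_index, best_score = i, score
    return best_index


# Hypothetical statistics for three children of a node.
chosen = puct_select(
    priors=[0.5, 0.3, 0.2],
    visit_counts=[10, 0, 0],
    mean_values=[0.6, 0.0, 0.0],
)
print(chosen)
```

With these hypothetical numbers the unvisited second child is chosen, because its exploration bonus outweighs the first child's higher mean value. The abstract's stated objective differs from this rule: it allocates a limited search budget to maximize the probability of correctly selecting the best action at the root.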

An Optimal Computing Budget Allocation Tree Policy for Monte Carlo Tree Search

An Efficient Dynamic Sampling Policy for Monte Carlo Tree Search

An Efficient Node Selection Policy for Monte Carlo Tree Search with Neural Networks

Doing Better Than UCT: Rational Monte Carlo Sampling in Trees

Monte-Carlo tree search with uncertainty propagation via optimal transport

An Analysis on the Effects of Evolving the Monte Carlo Tree Search Upper Confidence for Trees Selection Policy on Unimodal, Multimodal and Deceptive Landscapes

Monte Carlo Tree Search with Boltzmann Exploration

Optimized Monte Carlo Tree Search for Enhanced Decision Making in the FrozenLake Environment

Monte Carlo Search Algorithms Discovering Monte Carlo Tree Search Exploration Terms

Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search

An Efficient Node Selection Policy for Value Network Based Monte Carlo Tree Search

Monte Carlo Tree Search for Policy Optimization

Efficient Monte Carlo Tree Search via On-the-Fly State-Conditioned Action Abstraction

Solving Stochastic Orienteering Problems with Chance Constraints Using Monte Carlo Tree Search

Monte-Carlo Tree Search with Epsilon-Greedy for Game of Amazons

Monte Carlo Tree Descent for Black-Box Optimization

Generalized Mean Estimation in Monte-Carlo Tree Search

Learning decision trees through Monte Carlo tree search: An empirical evaluation

Exploring search space trees using an adapted version of Monte Carlo tree search for combinatorial optimization problems

Monte Carlo Tree Search in the Presence of Transition Uncertainty

Bayesian Optimized Monte Carlo Planning