Abstract: There is considerable interest in applying reinforcement learning (RL) to improve machine control across multiple industries, with the automotive industry a prime example. Monte Carlo Tree Search (MCTS) has proven powerful in decision-making games, even without knowledge of the rules. In this study, multibody system dynamics (MSD) control is first modeled as a Markov Decision Process and then solved with MCTS. Using randomized exploration of the search space, the MCTS framework builds a selective search tree by repeatedly running Monte Carlo rollouts from each child node. However, without a library of available choices, deciding among the many possible agent parameters can be intimidating. In addition, the large branching factor makes the search itself a significant challenge, one that is typically addressed through appropriate parameter design, search guiding, action reduction, parallelization, and early termination. To address these shortcomings, the overarching goal of this study is to provide insight into inverted pendulum control with vanilla and modified MCTS agents. A series of reward functions is designed according to the control goal; each maps a specific distribution shape of reward bonus and guides the MCTS-based controller to maintain the upright position. Numerical examples show that the reward-modified MCTS algorithms significantly improve control performance and robustness over the default constant reward used in vanilla MCTS. Exponentially decaying reward functions perform better than constant or polynomial reward functions. Moreover, the exploitation vs. exploration trade-off and the discount parameters are carefully tested. The study's results can guide users of RL-based MSD control.
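The three reward families compared in the abstract, together with the exploration and discount parameters it mentions, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the failure threshold `THETA_MAX`, the functional forms and default constants of `polynomial_reward` and `exponential_reward`, and the use of the standard UCB1 rule for node selection.

```python
import math

# Assumed failure threshold for the pendulum angle (hypothetical value).
THETA_MAX = math.radians(12.0)


def constant_reward(theta: float) -> float:
    """Vanilla-style reward: the same bonus for every non-failed state."""
    return 1.0 if abs(theta) <= THETA_MAX else 0.0


def polynomial_reward(theta: float, power: int = 2) -> float:
    """Reward that falls off polynomially with the normalized angle (assumed form)."""
    x = min(abs(theta) / THETA_MAX, 1.0)
    return 1.0 - x ** power


def exponential_reward(theta: float, decay: float = 5.0) -> float:
    """Reward that decays exponentially away from upright (assumed form)."""
    if abs(theta) > THETA_MAX:
        return 0.0
    return math.exp(-decay * abs(theta) / THETA_MAX)


def ucb1(total_value: float, visits: int, parent_visits: int,
         c: float = math.sqrt(2)) -> float:
    """Standard UCB1 score; c sets the exploitation vs. exploration trade-off."""
    if visits == 0:
        return math.inf  # unvisited children are tried first
    exploit = total_value / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def discounted_return(rewards: list[float], gamma: float = 0.95) -> float:
    """Discounted sum of the rewards collected along one rollout."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

Under these assumed forms, the constant reward cannot distinguish a nearly upright state from one close to failure, whereas the exponential reward concentrates the bonus near the upright position. This is one plausible reading of why a shaped reward might guide the search more effectively; the study's actual functions and parameter values may differ.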

SF-MCTS: Score Feedback Monte Carlo Tree Search for Digital Curling in Continuous State Space

Towards High Level Skill Learning: Learn to Return Table Tennis Ball Using Monte-Carlo Based Policy Gradient Method

A game strategy model in the digital curling system based on NFSP

Mastering Curling with RL-revised Decision Tree

Belief-state Monte-Carlo Tree Search for Phantom Games

Beyond Monte Carlo Tree Search: Playing Go with Deep Alternative Neural Network and Long-Term Evaluation

Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games

An Efficient Dynamic Sampling Policy for Monte Carlo Tree Search

Optimized Monte Carlo Tree Search for Enhanced Decision Making in the FrozenLake Environment

Continuous Monte Carlo Graph Search

A Self-Learning Monte Carlo Tree Search Algorithm for Robot Path Planning

Playing Carcassonne with Monte Carlo Tree Search

Dual Monte Carlo Tree Search

Monte Carlo tree search control scheme for multibody dynamics applications

Multiple Policy Value Monte Carlo Tree Search

Monte Carlo Tree Search: a review of recent modifications and applications

Development and Application of a Monte Carlo Tree Search Algorithm for Simulating Da Vinci Code Game Strategies

Elastic Monte Carlo Tree Search with State Abstraction for Strategy Game Playing

Development of Rehabilitation System (ReHabgame) through Monte-Carlo Tree Search Algorithm

Creating Adjustable Human-like AI Behavior in a 3D Tennis Game with Monte-Carlo Tree Search