Efficient Searching With MCTS and Imitation Learning: A Case Study in Pommerman

Hailan Yang, Shengze Li, Xinhai Xu, Xunyun Liu, Zhuxuan Meng, Yongjun Zhang
DOI: https://doi.org/10.1109/access.2021.3061313
IF: 3.9
2021-01-01
IEEE Access
Abstract: Pommerman is a popular reinforcement learning environment because it poses several challenges, such as sparse and deceptive rewards and delayed action effects. In this paper, we propose a reinforcement learning approach that combines Monte Carlo tree search (MCTS) with action pruning and flexible imitation learning to speed up the search, allowing the agent to avoid meaningless exploration and discover high-level strategies. On the Pommerman benchmark, we evaluate an agent driven by the proposed approach against heuristic and pure reinforcement learning baselines. The results show that our method yields relatively strong agent performance in combat, which demonstrates the method's efficiency in this specific domain and suggests its potential.
Computer Science, Information Systems; Telecommunications; Engineering, Electrical & Electronic