Reinforcement actor-critic learning as a rehearsal in MicroRTS

Shiron Manandhar, Bikramjit Banerjee
DOI: https://doi.org/10.1017/s0269888924000092
2024-11-10
The Knowledge Engineering Review
Abstract: Real-time strategy (RTS) games have provided a fertile ground for AI research with notable recent successes based on deep reinforcement learning (RL). However, RL remains a data-hungry approach featuring a high sample complexity. In this paper, we focus on a sample complexity reduction technique called reinforcement learning as a rehearsal (RLaR) and on the RTS game of MicroRTS to formulate and evaluate it. RLaR has been formulated in the context of action-value function based RL before. Here, we formulate it for a different RL framework, called actor-critic RL. We show that on the one hand the actor-critic framework allows RLaR to be much simpler, but on the other hand, it leaves room for a key component of RLaR: a prediction function that relates a learner's observations with those of its opponent. This function, when leveraged for exploration, accelerates RL as our experiments in MicroRTS show. Further experiments provide evidence that RLaR may reduce actor noise compared to a variant that does not utilize RLaR's exploration. This study provides the first evaluation of RLaR's efficacy in a domain with a large strategy space.
computer science, artificial intelligence