OB-HPPO: an Option and Intrinsic Curiosity Based Hierarchical Reinforcement Learning Approach for Real-Time Strategy Games

Ruilin Jiang, Yanlong Zhai, Yan Zheng, You Li, Yanglin Liu
DOI: https://doi.org/10.1007/978-981-97-5581-3_36
2024-01-01
Abstract: The multi-agent real-time strategy game problem is a classic problem in reinforcement learning, and solving it is highly instructive for economic and military applications in the real world. In recent years, researchers from many countries have made breakthroughs on related problems, but most of the resulting techniques target specific environments or require high-performance computing platforms. As a result, the time and resources consumed in training models increase exponentially as the complexity and scope of a task increase. In this paper, we propose OB-HPPO, an option and intrinsic curiosity based hierarchical reinforcement learning framework, to address these challenges. Our approach hierarchically decomposes a huge action space into several self-explainable options, simplifying atomic action decisions into a series of action decisions. OB-HPPO also introduces an intrinsic curiosity module (ICM) based on the Proximal Policy Optimization (PPO) algorithm to improve the efficiency of model training and exploration. Experimental results show that OB-HPPO takes less training time and accumulates more rewards than non-hierarchical models. We also test OB-HPPO against some representative AI models of the μRTS environment, and OB-HPPO's win rate is significantly improved.