Applying Online Expert Supervision in Deep Actor-Critic Reinforcement Learning.

Jin Zhang, Jiansheng Chen, Yiqing Huang, Weitao Wang, Tianpeng Li
DOI: https://doi.org/10.1007/978-3-030-03335-4_41
2018-01-01
Abstract: Deep reinforcement learning (DRL) has shown strong performance in a variety of decision-making and control problems, e.g. Atari games and the game of Go. It is inspiring to see DRL agents outperform even human masters. However, DRL algorithms require a large amount of computation and exploration, which makes DRL agents hard to train, especially in problems with large state and action spaces. Most DRL algorithms are also very sensitive to hyperparameters. To address these problems, we propose A3COE, a new algorithm that combines the A3C algorithm with online expert supervision. We applied it to mini-games of the well-known real-time strategy game StarCraft II. Results show that the algorithm substantially improves the agent’s performance with fewer training steps, and that training is more stable across a wider range of hyperparameters. We also show that the algorithm works even better when combined with curriculum learning.
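The abstract does not say how the expert signal enters the A3C objective. As a rough illustration only, one common way to combine an actor-critic loss with expert supervision is to add a cross-entropy term that pulls the policy toward the expert's action in each visited state. The sketch below evaluates such a combined loss with NumPy. The function name, the `expert_coef` weight, and the other coefficients are hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def a3c_with_expert_loss(logits, values, actions, returns, expert_actions,
                         value_coef=0.5, entropy_coef=0.01, expert_coef=1.0):
    """Evaluate an actor-critic loss plus an expert cross-entropy term.

    Assumed form, not the paper's: all coefficients are illustrative.
    logits:         (T, A) policy logits for T steps and A actions
    values:         (T,)   critic estimates V(s_t)
    actions:        (T,)   actions the agent actually took
    returns:        (T,)   bootstrapped n-step returns
    expert_actions: (T,)   actions an online expert suggests for the same states
    """
    probs = softmax(logits)
    log_probs = np.log(probs + 1e-12)
    idx = np.arange(len(actions))

    # Advantage estimate; in a real implementation it is treated as a
    # constant (no gradient) inside the policy term.
    advantages = returns - values

    policy_loss = -(log_probs[idx, actions] * advantages).mean()
    value_loss = (advantages ** 2).mean()
    entropy = -(probs * log_probs).sum(axis=-1).mean()

    # Supervised term: negative log-likelihood of the expert's actions.
    expert_loss = -log_probs[idx, expert_actions].mean()

    total = (policy_loss
             + value_coef * value_loss
             - entropy_coef * entropy
             + expert_coef * expert_loss)
    return total, expert_loss
```

With a uniform policy over three actions and a perfect critic, the policy and value terms vanish and the total is driven by the entropy bonus and the expert term. Setting `expert_coef=0.0` reduces the expression to a plain A3C-style loss, which is one way such a weight could be annealed over training.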