Learning to assign credit in reinforcement learning by incorporating abstract relations

Dong Yan, Shiyu Huang, Hang Su, Jun Zhu
2019-01-01
Abstract: Credit assignment, the problem of discovering which actions are responsible for rewards, is one of the most critical problems in reinforcement learning. It becomes more severe as reinforcement learning is applied to real-world scenarios where the decision process may involve thousands of actions. In this paper, we propose a novel framework that uses a computation process to assign credit to hundreds of thousands of nonterminal state-action pairs in order to accelerate learning. Specifically, we first abstract the states and actions of the original problem into a compact representation, which reduces the problem to a tractable size. Then, we solve the abstracted problem to obtain its optimal value function, that is, the expected return from future rewards. Finally, we use the derived value function to assign credit to state-action pairs of the original problem. We conduct extensive experiments on Doom, a complex 3D video game in which the reward signal is sparse. The experimental results demonstrate that our agent outperforms previous state-of-the-art agents by a large margin in both kill count and death count. The approach also proved effective in an online Doom competition, in which we placed 2nd in the final.
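The three steps in the abstract (abstract the problem, solve it for an optimal value function, reuse that value function for credit) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three-state toy problem, the `abstract_state` mapping, and the use of potential-based reward shaping as the way to turn the abstract value function into per-step credit. The abstract does not say how the authors perform that last step.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Solve a small abstract MDP.

    P[a, s, s2] is the transition probability, R[a, s] the expected reward.
    Returns the optimal state-value function V* as a 1-D array.
    """
    V = np.zeros(P.shape[1])
    while True:
        Q = R + gamma * (P @ V)      # shape (n_actions, n_states)
        V_new = Q.max(axis=0)        # greedy backup over actions
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

def abstract_state(distance):
    """Hypothetical abstraction: bucket a raw distance-to-goal into 3 states."""
    if distance == 0:
        return 2                     # at goal (absorbing)
    return 1 if distance <= 5 else 0 # near / far

def shaped_reward(r, z, z_next, V, gamma=0.9):
    """Assumed credit rule: potential-based shaping with V as the potential."""
    return r + gamma * V[z_next] - V[z]

# Toy abstract problem: action 0 = stay, action 1 = advance one abstract state.
P = np.zeros((2, 3, 3))
P[0] = np.eye(3)                     # stay: remain in the same abstract state
P[1, 0, 1] = P[1, 1, 2] = P[1, 2, 2] = 1.0
R = np.zeros((2, 3))
R[1, 1] = 1.0                        # sparse reward: only for reaching the goal

V_abs = value_iteration(P, R)

# Credit for two raw transitions that both receive zero environment reward.
advance = shaped_reward(0.0, abstract_state(9), abstract_state(4), V_abs)
stay = shaped_reward(0.0, abstract_state(9), abstract_state(8), V_abs)
print(V_abs, advance, stay)
```

In this toy setting both raw transitions receive zero reward from the environment, yet the shaped signal ranks the transition that moves toward the goal above the one that stays far away, which is the kind of dense credit a sparse-reward task lacks.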