Guided Goal Generation for Hindsight Multi-Goal Reinforcement Learning

Chenjia Bai, Peng Liu, Wei Zhao, Xianglong Tang
DOI: https://doi.org/10.1016/j.neucom.2019.06.022
IF: 6
2019-01-01
Neurocomputing
Abstract: Typical reinforcement learning (RL) can only perform a single task and thus cannot scale to problems for which an agent needs to perform multiple tasks, such as moving objects to different locations, which is relevant to real-world environments. Hindsight experience replay (HER) based on universal value functions shows promising results in such multi-goal settings by substituting achieved goals for the original goal, so that the agent receives rewards frequently. However, the achieved goals are limited to the current policy level and lack guidance for learning. We propose a novel guided goal-generation model for multi-goal RL named G-HER. Our method uses a conditional generative recurrent neural network (RNN) to explicitly model the relationship between policy level and goals, enabling the generation of various goals conditioned on different policy levels. Goals generated with a higher policy level provide better guidance for the RL agent, which is equivalent to using knowledge of a successful policy in advance to guide the learning of the current policy. Our model accelerates the generalization of substitute goals to the whole goal space. The G-HER algorithm is evaluated on several robotic manipulation tasks and demonstrates improved performance and sample efficiency.
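The goal substitution that the abstract attributes to HER can be illustrated with a minimal sketch. This is not the authors' G-HER implementation: it shows only plain hindsight relabeling with the goal achieved at the end of an episode, and it assumes a sparse reward of 0 on success and -1 otherwise. The names `Transition`, `sparse_reward`, `relabel_with_final_goal`, and the tolerance `tol` are hypothetical and chosen for illustration.

```python
import math
from typing import List, NamedTuple, Tuple

Goal = Tuple[float, ...]


class Transition(NamedTuple):
    """One step of a goal-conditioned episode (hypothetical layout)."""
    state: Tuple[float, ...]
    action: int
    achieved_goal: Goal  # goal actually reached after taking the action
    goal: Goal           # goal the agent was asked to reach
    reward: float


def sparse_reward(achieved: Goal, goal: Goal, tol: float = 0.05) -> float:
    """Assumed convention: 0 if achieved is within tol of goal, else -1."""
    return 0.0 if math.dist(achieved, goal) <= tol else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Substitute the goal achieved at the end of the episode for the
    original goal and recompute every reward against that substitute."""
    substitute = episode[-1].achieved_goal
    return [
        t._replace(goal=substitute,
                   reward=sparse_reward(t.achieved_goal, substitute))
        for t in episode
    ]


# A failed two-step episode: the agent was asked to reach 2.0 but ended at 1.0.
episode = [
    Transition((0.0,), 1, (0.5,), (2.0,), -1.0),
    Transition((0.5,), 1, (1.0,), (2.0,), -1.0),
]
relabeled = relabel_with_final_goal(episode)
```

After relabeling, the final transition counts as a success for the substitute goal, which is how HER gives the agent frequent rewards. In this sketch the substitute goal always comes from the current trajectory, which is the limitation the abstract points to; G-HER would instead draw substitute goals from its conditional generative RNN.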