Learning Robotic Skills Via Self-Imitation and Guide Reward

Chenyang Ran, Jianbo Su
DOI: https://doi.org/10.1109/smc52423.2021.9658945
2021-01-01
Abstract: Reinforcement learning (RL) has been extensively studied for robotic skill acquisition. Nevertheless, existing methods require either extensive environmental interaction or high-quality demonstrations, which limits their practical application. To alleviate this problem, a practical algorithm named self-imitation learning with guide reward (SILGR) is proposed. Instead of relying on external demonstrations, the algorithm selects relatively good trajectories as expert data and then assigns a guide reward to each transition. The criterion used by the guide reward generator improves consistently as the agent evolves. In this way, the agent explores the environment in a task-relevant direction and exploits its experience more effectively, which improves both sample efficiency and performance. Results on four continuous locomotion tasks indicate that the proposed scheme achieves better performance than other state-of-the-art deep RL methods.
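The two ideas in the abstract, keeping relatively good trajectories as expert data and scoring each transition with a guide reward, can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not specify how the guide reward generator is built, so a simple nearest-neighbour similarity stands in for it here. The names `EliteBuffer`, `guide_reward`, and `scale` are hypothetical, and each transition is assumed to be a flat tuple of floats (state and action concatenated).

```python
import heapq
import math
from itertools import count


class EliteBuffer:
    """Keeps the `capacity` highest-return trajectories seen so far.

    Hypothetical helper; the selection rule in the paper may differ.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []            # min-heap of (return, tiebreak, trajectory)
        self._tiebreak = count()   # unique counter so trajectories are never compared

    def add(self, trajectory, episode_return):
        """Insert a trajectory if it is good enough; return True if it was kept."""
        item = (episode_return, next(self._tiebreak), trajectory)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, item)
            return True
        if episode_return > self._heap[0][0]:
            # Evict the current worst elite trajectory.
            heapq.heapreplace(self._heap, item)
            return True
        return False

    def threshold(self):
        """Lowest return currently stored; it rises as the agent improves."""
        return self._heap[0][0] if self._heap else -math.inf

    def transitions(self):
        """All transitions from the stored elite trajectories."""
        return [t for _, _, traj in self._heap for t in traj]


def guide_reward(transition, elite_transitions, scale=1.0):
    """Stand-in guide reward: similarity to the nearest elite transition.

    Assumption: exp(-distance / scale) replaces the paper's (unspecified
    here) guide reward generator.
    """
    if not elite_transitions:
        return 0.0
    nearest = min(math.dist(transition, e) for e in elite_transitions)
    return math.exp(-nearest / scale)
```

In a training loop, the guide reward would typically be combined with the environment reward before the policy update. Because the buffer only admits trajectories that beat its current worst entry, the standard for "expert" data tightens as the agent improves, which mirrors the abstract's statement that the criterion evolves with the agent.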