Goal-conditioned Behavioral Cloning with Prioritized Sampling

Fei Ma, Guanjun Liu, Kaiwen Zhang
DOI: https://doi.org/10.1109/ICNSC52481.2021.9702233
2021-01-01
Abstract: Imitation learning is a promising approach for extracting knowledge from human demonstrations. In traditional imitation learning methods such as behavioral cloning, demonstration transitions are sampled uniformly from the replay buffer regardless of their differing values. In addition, an agent trained this way is limited to solving a single specific task. In this paper, we extend traditional behavioral cloning with a prioritized sampling technique. To make our method more general, we add an element to the original framework, the goal, defined as the difference between the last state of the demonstration trajectory and the current state. Our method is simple to implement and rests on one assumption: that a small number of expert demonstrations is available. We show that our method outperforms common reinforcement learning algorithms on some image-based and hierarchical tasks in the Minecraft environment. Further, we illustrate that it has the potential to exceed a human-level policy when combined with reinforcement learning algorithms.
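The two ideas named in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The goal is computed as the abstract defines it (last demonstration state minus current state). The abstract does not say how priorities are assigned, so the sketch assumes each transition's priority is its most recent imitation loss, raised to a hypothetical exponent `alpha` in the style of prioritized experience replay. The function names, `alpha`, and `eps` are all illustrative.

```python
import random


def goal_conditioned_inputs(trajectory):
    """Append to each state the goal: last state minus current state."""
    last = trajectory[-1]
    inputs = []
    for state in trajectory:
        goal = [l - s for l, s in zip(last, state)]
        inputs.append(list(state) + goal)
    return inputs


def sampling_probabilities(priorities, alpha=0.6, eps=1e-6):
    """Assumed rule: probability proportional to (priority + eps) ** alpha."""
    scaled = [(p + eps) ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_indices(priorities, batch_size, rng, alpha=0.6):
    """Draw transition indices with replacement, favoring high priorities."""
    probs = sampling_probabilities(priorities, alpha)
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)


# Toy demonstration with three 2-D states (made-up values).
trajectory = [[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]]
inputs = goal_conditioned_inputs(trajectory)

# Assumed priorities: per-transition imitation losses (made-up values).
losses = [0.1, 0.9, 0.1]
probs = sampling_probabilities(losses)
batch = sample_indices(losses, batch_size=4, rng=random.Random(0))
```

With uniform sampling every transition would be drawn with probability 1/3; under the assumed rule, the transition with the largest loss is drawn more often. Setting `alpha=0` recovers uniform sampling.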