Learning from Hindsight Demonstrations.

Mengxuan Shao, Feng Jiang, Shaohui Liu, Kun Han, Debin Zhao
DOI: https://doi.org/10.1007/978-981-99-1642-9_41
2023-01-01
Abstract: Learning from demonstrations (LfD) is an important technique for speeding up reinforcement learning (RL) training, especially when rewards are sparse. A major obstacle, however, is that expert demonstrations are difficult or expensive to obtain in many cases. In this paper, we propose a method called Learning from Hindsight Demonstrations (LfHD) that automatically produces hindsight demonstrations on which LfD can be performed, avoiding the cost of acquiring expert demonstrations. The produced demonstrations are comparable to those of experts at a certain success rate. We also improve the LfD method to make better use of the produced demonstrations. Experiments show that our method greatly improves training efficiency compared to existing algorithms.