Model-based Adversarial Imitation Learning from Demonstrations and Human Reward

Jie Huang, Jiangshan Hao, Rongshun Juan, Randy Gomez, Keisuke Nakamura, Guangliang Li
DOI: https://doi.org/10.1109/iros55552.2023.10341411
2023-01-01
Abstract: Reinforcement learning (RL) can potentially be applied to real-world robot control in complex and uncertain environments. However, it is difficult or even impractical to design an efficient reward function for various tasks, especially in large and high-dimensional environments. Generative adversarial imitation learning (GAIL), a general model-free imitation learning method, allows robots to learn policies directly from expert trajectories in large and high-dimensional environments. However, GAIL is still sample inefficient in terms of environmental interaction. To solve this problem, we propose model-based adversarial imitation learning from demonstrations and human reward (MAILDH), a novel model-based interactive imitation framework that combines the advantages of GAIL, interactive RL, and model-based RL. We tested our method on eight physics-based discrete and continuous control tasks for RL. Our results show that MAILDH can greatly improve sample efficiency and robustness compared to the original GAIL.