Imitation Learning by Reinforcement Learning

Kamil Ciosek
DOI: https://doi.org/10.48550/arXiv.2108.04763
2022-03-15
Abstract: Imitation learning algorithms learn a policy from demonstrations of expert behavior. We show that, for deterministic experts, imitation learning can be done by reduction to reinforcement learning with a stationary reward. Our theoretical analysis both certifies the recovery of expert reward and bounds the total variation distance between the expert and the imitation learner, showing a link to adversarial imitation learning. We conduct experiments which confirm that our reduction works well in practice for continuous control tasks.
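To make the two central objects in the abstract concrete, here is a minimal sketch for a small discrete setting. The abstract does not spell out the form of the stationary reward, so the indicator reward below (1 for state-action pairs seen in the expert data, 0 otherwise) is an assumption chosen for illustration, as are the function names `make_indicator_reward` and `total_variation` and the toy data. The second function computes the standard empirical total variation distance between two sets of samples.

```python
from collections import Counter


def make_indicator_reward(expert_pairs):
    """Build a stationary reward from expert (state, action) pairs.

    Assumption for illustration: reward is 1 on pairs that appear in the
    expert data and 0 elsewhere. It does not change during training.
    """
    expert_set = set(expert_pairs)

    def reward(state, action):
        return 1.0 if (state, action) in expert_set else 0.0

    return reward


def total_variation(samples_p, samples_q):
    """Empirical total variation distance between two samples.

    TV(p, q) = 0.5 * sum_x |p(x) - q(x)|, with p and q estimated
    from relative frequencies.
    """
    p, q = Counter(samples_p), Counter(samples_q)
    n_p, n_q = len(samples_p), len(samples_q)
    support = set(p) | set(q)
    return 0.5 * sum(abs(p[x] / n_p - q[x] / n_q) for x in support)


# Hypothetical toy data: the learner deviates from the expert in state 1.
expert = [(0, "right"), (1, "right"), (2, "stop")]
learner = [(0, "right"), (1, "left"), (2, "stop")]

reward = make_indicator_reward(expert)
learner_return = sum(reward(s, a) for s, a in learner)
tv = total_variation(expert, learner)
```

In this toy case the learner collects reward on two of its three steps, and the mismatch in one pair shows up as a nonzero total variation distance. Any standard reinforcement learning algorithm could then be run against `reward` in place of an environment reward.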
Machine Learning