An End-to-end Inverse Reinforcement Learning by a Boosting Approach with Relative Entropy

Tao Zhang, Ying Liu, Maxwell Hwang, Kao-Shing Hwang, ChunYan Ma, Jing Cheng
DOI: https://doi.org/10.1016/j.ins.2020.01.023
IF: 8.1
2020-01-01
Information Sciences
Abstract: Inverse reinforcement learning (IRL) involves imitating expert behaviors by recovering reward functions from demonstrations. This study proposes a model-free IRL algorithm to solve the dilemma of predicting the unknown reward function. The proposed end-to-end model comprises a dual structure of autoencoders in parallel. The model uses a state encoding method to reduce the computational complexity for high-dimensional environments and utilizes an Adaboost classifier to determine the difference between the predicted and demonstrated reward functions. Relative entropy is used as a metric to measure the difference between the demonstrated and the imitated behavior. The simulation experiments demonstrate the effectiveness of the proposed method in terms of the number of iterations that are required for the estimation.
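The abstract does not state which distributions the relative entropy is computed over. As a minimal sketch, assuming the comparison is between empirical state-visitation distributions of the demonstrated and imitated trajectories, the metric could look like the following. The function names, the toy trajectories, and the smoothing constant are all illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter


def state_distribution(trajectories, states, smoothing=1e-6):
    """Empirical state-visitation distribution with additive smoothing.

    The smoothing term (an assumption here) keeps every probability
    strictly positive so the log-ratio below is always defined.
    """
    counts = Counter(s for trajectory in trajectories for s in trajectory)
    total = sum(counts[s] + smoothing for s in states)
    return {s: (counts[s] + smoothing) / total for s in states}


def relative_entropy(p, q):
    """D_KL(p || q) = sum_s p(s) * log(p(s) / q(s)), measured in nats."""
    return sum(p[s] * math.log(p[s] / q[s]) for s in p if p[s] > 0.0)


# Hypothetical toy data: each trajectory is a list of visited state labels.
states = ["s0", "s1", "s2"]
expert_trajectories = [["s0", "s1", "s2"], ["s0", "s1", "s1"]]
learner_trajectories = [["s0", "s0", "s2"], ["s0", "s1", "s2"]]

p_expert = state_distribution(expert_trajectories, states)
p_learner = state_distribution(learner_trajectories, states)

# Zero when the two distributions match; grows as the behaviors diverge.
divergence = relative_entropy(p_expert, p_learner)
print(f"relative entropy (expert || learner): {divergence:.4f}")
```

Relative entropy is asymmetric, so the order of the arguments matters. Here the demonstrated distribution is placed first, which penalizes the imitated behavior for assigning low probability to states the expert visits often.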