AdaBoost Maximum Entropy Deep Inverse Reinforcement Learning with Truncated Gradient

Li Song, Dazi Li, Xiao Wang, Xin Xu
DOI: https://doi.org/10.1016/j.ins.2022.04.017
IF: 8.1
2022-04-01
Information Sciences
Abstract: In complex, nonlinear environments, it is necessary to study the representational capacity of neural networks for learning nonlinear rewards. In recent years, the maximum entropy deep inverse reinforcement learning (ME-DIRL) algorithm has been increasingly applied to learning such rewards. However, learning nonlinear rewards remains challenging when expert demonstration data are limited and imbalanced, when the calculations are complex, or when the model overfits. We propose a novel algorithm, ME-DIRL with AdaBoost (AME-DIRL), to address these problems. AME-DIRL uses the AdaBoost algorithm to combine multiple ME-DIRL processes into a strong learner, thereby overcoming the imbalance of the data set. To deal with the complex calculations in AME-DIRL, a truncated gradient (TG) method is applied to sparsify the rewards obtained by the strong learner, which reduces the model complexity. To prevent overfitting, a correction factor is added to the linear combination of weak learners. AME-DIRL models the relationship between input features and output rewards, approximating the rewards with a convolutional neural network (CNN) that uses scaled exponential linear units (SELUs). Numerical results indicate that AME-DIRL learns rewards more accurately than several classical inverse reinforcement learning (IRL) algorithms.
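The abstract does not spell out how the truncated gradient (TG) step produces sparsity, so the following is only a minimal sketch of the general truncated gradient idea as commonly described in the online learning literature: after an ordinary gradient step, weights whose magnitude is below a threshold are shrunk toward zero and clipped at zero. The names `truncate`, `truncated_gradient_step`, `lr`, `gravity`, and `threshold` are hypothetical, and this may differ from the exact variant used in AME-DIRL.

```python
def truncate(value, shrink, threshold):
    """Shrink a small weight toward zero; leave large weights untouched.

    Assumption: this is the generic truncation operator, not necessarily
    the exact form used in the paper.
    """
    if 0.0 <= value <= threshold:
        # Small positive weight: subtract the shrink amount, clip at zero.
        return max(0.0, value - shrink)
    if -threshold <= value < 0.0:
        # Small negative weight: add the shrink amount, clip at zero.
        return min(0.0, value + shrink)
    # Magnitude above the threshold: no truncation.
    return value


def truncated_gradient_step(weights, grads, lr, gravity, threshold):
    """One plain gradient step followed by truncation of each weight.

    lr        -- learning rate (hypothetical parameter name)
    gravity   -- strength of the pull toward zero (hypothetical name)
    threshold -- only weights with |w| <= threshold are shrunk
    """
    shrink = lr * gravity
    return [
        truncate(w - lr * g, shrink, threshold)
        for w, g in zip(weights, grads)
    ]


if __name__ == "__main__":
    # With zero gradients, only the truncation acts on the weights.
    weights = [0.05, -0.02, 2.0]
    grads = [0.0, 0.0, 0.0]
    print(truncated_gradient_step(weights, grads, lr=0.1, gravity=0.5, threshold=1.0))
```

In this sketch the two small weights are driven exactly to zero while the large weight is left unchanged, which is the sense in which truncation yields a sparser, less complex model.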
computer science, information systems