Adaptive Generative Adversarial Maximum Entropy Inverse Reinforcement Learning

Li Song, Dazi Li, Xin Xu
DOI: https://doi.org/10.1016/j.ins.2024.121712
IF: 8.1
2024-01-01
Information Sciences
Abstract: Maximum entropy inverse reinforcement learning (MEIRL) algorithms have been extensively studied for learning rewards and optimizing policies from expert demonstrations. However, high-dimensional features and limited or non-optimal expert demonstrations can easily lead to overfitting, vanishing gradients, exploding gradients, and slow convergence. To address these challenges, an adaptive generative adversarial maximum entropy inverse reinforcement learning algorithm, termed AGA-MEIRL, is proposed. This algorithm can learn rewards and optimize policies with updated mixed expert demonstrations. The primary contribution is that the adaptive generative adversarial network (AdaGAN) helps potentially weak individual predictors aggregate into a strong composite predictor, thus solving the mode collapse and overfitting problems of the discriminator in learning rewards. To solve the vanishing gradient problem, the SELU activation function is used in AGA-MEIRL. Additionally, gradient clipping is introduced into AGA-MEIRL to tackle the exploding gradient problem, enhancing the algorithm’s stability and preventing data overflow. The convergence analysis of AGA-MEIRL is established based on the upper bound of the AdaGAN. Experimental results on benchmark and rolling bearing fault diagnosis experiments demonstrate that AGA-MEIRL achieves superior rewards and success rates, effectively addressing the problems above and outperforming current MEIRL approaches in learning rewards and policies from mixed expert demonstrations.
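The two stabilization techniques named in the abstract, SELU activation and gradient clipping, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say whether clipping is by value or by norm, so clipping by global L2 norm is assumed here, and the function names `selu` and `clip_by_global_norm` are hypothetical. The SELU constants are the standard published values, not taken from this paper.

```python
import math

# Standard SELU constants from the self-normalizing networks literature
# (assumed here; the abstract does not list them).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x: float) -> float:
    """Scaled exponential linear unit for a single value.

    Positive inputs are scaled linearly; negative inputs saturate
    smoothly toward -SELU_SCALE * SELU_ALPHA instead of being zeroed.
    """
    if x > 0:
        return SELU_SCALE * x
    return SELU_SCALE * SELU_ALPHA * (math.exp(x) - 1.0)


def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient values so their L2 norm is at most max_norm.

    Assumption: norm-based clipping; the paper may use a different variant.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)  # already within the bound, return a copy
    factor = max_norm / norm
    return [g * factor for g in grads]
```

Because SELU keeps a nonzero slope for negative inputs, gradients are less likely to vanish than with a saturating activation, while norm clipping bounds the size of any single update without changing its direction.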