SC-AIRL: Share-Critic in Adversarial Inverse Reinforcement Learning for Long-Horizon Task

Guangyu Xiang, Shaodong Li, Feng Shuang, Fang Gao, Xiaogang Yuan
DOI: https://doi.org/10.1109/lra.2024.3366023
IF: 5.2
2024-04-01
IEEE Robotics and Automation Letters
Abstract: Adversarial Inverse Reinforcement Learning (AIRL) has gained popularity as an alternative to supervised imitation learning because it addresses the latter's distributional bias issue. However, it still faces significant challenges in long-horizon tasks due to the lack of effective exploration. In this letter, we demonstrate that standard AIRL strategies end exploration prematurely during online reinforcement learning and fail to learn the entire task because they cannot fully conform to the expert distribution, which is particularly detrimental to real-world robots. To address these challenges, we introduce the SC-AIRL approach. It decomposes long-horizon tasks into logical subtasks, which reduces the agent's need for rich exploration. SC-AIRL uses expert demonstrations to learn multiple subtasks and shares a single critic and an identical reward function across the training of the different subtasks. Additionally, we incorporate a human intervention mechanism during subtask learning to keep exploration from ending prematurely. Our experiments on challenging robot manipulation tasks demonstrate that SC-AIRL significantly outperforms our baselines. Furthermore, we conduct an exploratory experiment that highlights the model's potential to manage complex tasks, and an empirical analysis that highlights the advantages of SC-AIRL over the baseline.
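The abstract does not give implementation details, so the following is only a minimal sketch of what "a single critic and an identical reward function shared across subtask trainings" could look like. Everything here is an assumption for illustration: the class names `SharedCritic` and `SubtaskLearner`, the linear Q-function, the TD(0) update, and the hyperparameters are hypothetical and are not the authors' code. The reward uses the standard AIRL discriminator form D = exp(f) / (exp(f) + pi), for which log D - log(1 - D) simplifies to f - log pi.

```python
def airl_reward(f_value, log_pi):
    """AIRL-style reward log D - log(1 - D), with D = exp(f) / (exp(f) + pi).

    Algebraically this equals f - log pi, which is what is computed here.
    """
    return f_value - log_pi


class SharedCritic:
    """One linear Q-function reused by every subtask (hypothetical sketch)."""

    def __init__(self, n_features, lr=0.1, gamma=0.9):
        self.w = [0.0] * n_features
        self.lr = lr
        self.gamma = gamma

    def q(self, features):
        # Linear value estimate: dot product of weights and features.
        return sum(w * x for w, x in zip(self.w, features))

    def td_update(self, features, reward, next_features, done):
        # One TD(0) step; terminal transitions do not bootstrap.
        bootstrap = 0.0 if done else self.gamma * self.q(next_features)
        td_error = reward + bootstrap - self.q(features)
        self.w = [w + self.lr * td_error * x for w, x in zip(self.w, features)]
        return td_error


class SubtaskLearner:
    """Per-subtask learner that holds references to the shared critic and reward."""

    def __init__(self, name, critic, reward_fn):
        self.name = name
        self.critic = critic        # same object for every subtask
        self.reward_fn = reward_fn  # same function for every subtask

    def train_step(self, features, f_value, log_pi, next_features, done):
        reward = self.reward_fn(f_value, log_pi)
        return self.critic.td_update(features, reward, next_features, done)
```

In this sketch, constructing one `SharedCritic` and passing it to several `SubtaskLearner` instances (for example, hypothetical "reach" and "grasp" subtasks) means that every subtask's transitions update the same value estimate, which is the sharing idea the abstract describes. The per-subtask policies and the human intervention mechanism are omitted because the abstract does not specify them.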
robotics