Robot Imitation Learning from Image-Only Observation Without Real-World Interaction

Xuanhui Xu, Mingyu You, Hongjun Zhou, Zhifeng Qian, Bin He
DOI: https://doi.org/10.1109/tmech.2022.3217048
2023-01-01
IEEE/ASME Transactions on Mechatronics
Abstract: Learning from observation (LfO) enables a robot to imitate actions from an expert's states via deep reinforcement learning (RL), and it achieves satisfactory results in simulation environments through hundreds of thousands of robot–environment interactions. In the real world, however, interaction between a real robot and its environment is expensive and potentially dangerous, which makes LfO difficult to deploy widely. Reducing the number of real-world interactions required during LfO training therefore remains an active research topic. Although significant progress has been made, some interaction has remained unavoidable. This article proposes the LION net (Learning from Image-only Observation net), which learns actions from image-only demonstrations, e.g., a video of a human demonstrating a task, and reduces the number of robot–environment interactions in the real world to zero. It is expected to be a realistic solution for real-world robot LfO. The LION net comprises two modules: 1) a domain transfer module that bridges the gap between the simulator and the real world, and 2) an RL-based control module that takes images as input to learn a task. The LION net enables the robot to imitate an action from an image-only human demonstration in the simulator and to perform the learned action in the real world without additional training. The proposed method is evaluated on three real-life tasks deployed to a real-world robot, 1) pouring water, 2) washing a cup, and 3) stacking cubes, and demonstrates state-of-the-art performance.
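The two-module structure described in the abstract can be illustrated with a minimal sketch. It only shows how an image-to-image domain transfer stage and an image-based control policy might be composed at deployment time; it is not the authors' implementation. The direction of the transfer (real image to simulator-style image), the names `make_pipeline`, `toy_domain_transfer`, and `toy_policy`, the grayscale list-of-lists image format, and the 2-D action are all assumptions made for illustration.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image: rows of pixel intensities
Action = List[float]       # hypothetical 2-D action (dx, dy)


def make_pipeline(
    domain_transfer: Callable[[Image], Image],
    policy: Callable[[Image], Action],
) -> Callable[[Image], Action]:
    """Compose the two stages: input image -> transferred image -> action."""

    def act(real_image: Image) -> Action:
        sim_like = domain_transfer(real_image)  # bridge the sim/real gap
        return policy(sim_like)                 # image-based controller

    return act


# Hypothetical stand-ins below; they are NOT the paper's networks.
def toy_domain_transfer(image: Image) -> Image:
    """Placeholder image-to-image mapping: rescale intensities to [0, 1]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on constant images
    return [[(p - lo) / span for p in row] for row in image]


def toy_policy(image: Image) -> Action:
    """Placeholder policy: offset from the image centre to the brightest pixel."""
    h, w = len(image), len(image[0])
    r, c = max(
        ((i, j) for i in range(h) for j in range(w)),
        key=lambda ij: image[ij[0]][ij[1]],
    )
    return [c - (w - 1) / 2, r - (h - 1) / 2]


controller = make_pipeline(toy_domain_transfer, toy_policy)
action = controller([[0.2, 0.2, 0.2], [0.2, 0.2, 0.9], [0.2, 0.2, 0.2]])
```

In this sketch the policy never sees raw input images, only the output of the transfer stage. That is one plausible way a controller trained purely in simulation could be reused on a real robot without further training, under the assumptions stated above.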