Simultaneously learning intentions and preferences during physical human-robot cooperation

van der Spaa, Linda; Kober, Jens
DOI: https://doi.org/10.1007/s10514-024-10167-3
IF: 3.255
2024-06-05
Autonomous Robots
Abstract: The advent of collaborative robots allows humans and robots to cooperate in a direct and physical way. While this leads to amazing new opportunities to create novel robotics applications, it is challenging to make the collaboration intuitive for the human. From a system's perspective, understanding the human intentions seems to be one promising way to get there. However, human behavior exhibits large variations between individuals, such as for instance preferences or physical abilities. This paper presents a novel concept for simultaneously learning a model of the human intentions and preferences incrementally during collaboration with a robot. Starting out with a nominal model, the system acquires collaborative skills step-by-step within only very few trials. The concept is based on a combination of model-based reinforcement learning and inverse reinforcement learning, adapted to fit collaborations in which human and robot think and act independently. We test the method and compare it to two baselines: one that imitates the human and one that uses plain maximum entropy inverse reinforcement learning, both in simulation and in a user study with a Franka Emika Panda robot arm.
robotics, computer science, artificial intelligence
What problem does this paper attempt to address?
The paper addresses how a robot can learn a human partner's intentions and preferences while physically collaborating with them. The authors propose a method that lets the robot refine its model of both incrementally as the collaboration proceeds. The main objectives are:

1. **Cooperative Intention Perception**: Enable the robot to understand and predict its human partner's intentions, making cooperation smoother and more intuitive.
2. **Personalized Preference Model**: Learn and update a personalized preference model from actual cooperation with the human partner.
3. **Dual-Layer Theory of Mind Reasoning**: Use a two-layer Theory of Mind model to distinguish the robot's behavior from the human's, so that the cooperative behavior can be optimized more effectively.
4. **Rapid Learning**: Combine model-based reinforcement learning and inverse reinforcement learning so that the robot learns cooperative strategies within very few trials.

The core of the research is a method that extracts information from the ongoing cooperation and uses it to keep improving the robot's responses. The method must handle the robot's own decision-making and also model the human partner's behavioral motivations and how they change. In this way, the robot can adapt to the human partner's behavior patterns in real cooperative tasks and improve the efficiency of the cooperation.
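
To make the inverse reinforcement learning idea concrete, here is a minimal sketch of plain maximum entropy inverse reinforcement learning over a small, finite set of candidate trajectories. This corresponds to the kind of baseline the abstract mentions, not to the authors' combined method. The feature definitions, the candidate trajectories, the learning rate, and the function names `maxent_probs` and `update_weights` are all assumptions made for illustration. After each trial, the preference weights move toward the features of the trajectory the human actually chose.

```python
import math


def maxent_probs(weights, features):
    """Return P(trajectory) proportional to exp(weights . features)."""
    scores = [sum(w * f for w, f in zip(weights, feats)) for feats in features]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def update_weights(weights, features, demo_index, lr=0.5):
    """One gradient-ascent step on the log-likelihood of the observed choice.

    The gradient is the observed feature vector minus the feature
    expectation under the current maximum entropy distribution.
    """
    probs = maxent_probs(weights, features)
    expected = [
        sum(p * feats[k] for p, feats in zip(probs, features))
        for k in range(len(weights))
    ]
    demo = features[demo_index]
    return [w + lr * (d - e) for w, d, e in zip(weights, demo, expected)]


# Hypothetical candidate trajectories for a joint carrying task, each
# described by two assumed features: (-path length, -human effort).
candidates = [
    (-1.0, -3.0),  # short but strenuous
    (-2.0, -1.0),  # longer but comfortable
    (-3.0, -2.5),  # long and strenuous
]

weights = [0.0, 0.0]  # nominal model: no preference yet
for trial in range(5):
    # Assume the human chooses the comfortable trajectory in every trial.
    weights = update_weights(weights, candidates, demo_index=1)

probs = maxent_probs(weights, candidates)
```

In this toy setting the weight on the effort feature grows with every trial, so the model assigns increasing probability to the comfortable trajectory. A full method along the lines the paper describes would additionally need the robot's own planning and a model of how human and robot act independently, which this sketch leaves out.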