Guiding real-world reinforcement learning for in-contact manipulation tasks with Shared Control Templates

Padalkar, Abhishek; Quere, Gabriel; Raffin, Antonin; Silvério, João; Stulp, Freek
DOI: https://doi.org/10.1007/s10514-024-10164-6
IF: 3.255
2024-06-05
Autonomous Robots
Abstract: The requirement for a high number of training episodes has been a major limiting factor for the application of Reinforcement Learning (RL) in robotics. Learning skills directly on real robots requires time, causes wear and tear and can lead to damage to the robot and environment due to unsafe exploratory actions. The success of learning skills in simulation and transferring them to real robots has also been limited by the gap between reality and simulation. This is particularly problematic for tasks involving contact with the environment as contact dynamics are hard to model and simulate. In this paper we propose a framework which leverages a shared control framework for modeling known constraints defined by object interactions and task geometry to reduce the state and action spaces and hence the overall dimensionality of the reinforcement learning problem. The unknown task knowledge and actions are learned by a reinforcement learning agent by conducting exploration in the constrained environment. Using a pouring task and grid-clamp placement task (similar to peg-in-hole) as use cases and a 7-DoF arm, we show that our approach can be used to learn directly on the real robot. The pouring task is learned in only 65 episodes (16 min) and the grid-clamp placement task is learned in 75 episodes (17 min) with strong safety guarantees and simple reward functions, greatly alleviating the need for simulation.
robotics; computer science, artificial intelligence
What problem does this paper attempt to address?
The paper addresses the large number of training episodes required to apply Reinforcement Learning (RL) to real robotic manipulation. Specifically:

1. **Many Training Episodes**: Learning skills directly on a real robot requires a large number of training episodes, which is time-consuming, causes wear and tear, and can damage the robot and its environment through unsafe exploratory actions.
2. **Reality-Simulation Gap**: Learning skills in simulation and transferring them to a real robot is an alternative, but the gap between reality and simulation limits its effectiveness. The gap is widest for in-contact tasks, because contact dynamics are hard to model and simulate accurately.

To solve these problems, the paper proposes a framework that uses Shared Control Templates (SCTs) to model known task constraints. This reduces the dimensionality of the state and action spaces, so that reinforcement learning can be conducted more safely and efficiently on real robots. The specific methods include:

- **Utilizing Known Constraints**: SCTs model the known constraints defined by object interactions and task geometry, which reduces the dimensionality of the state and action spaces.
- **Reinforcement Learning Agent Explores Unknown Parts**: Within the constrained environment, the reinforcement learning agent learns the unknown task knowledge and actions through exploration.
- **Simplifying Reward Function Design**: Because the constraints are modeled explicitly, the reward function can stay simple, which accelerates learning and improves safety.

The paper validates the method on two tasks performed with a 7-DoF arm: a pouring task and a grid-clamp placement task (similar to peg-in-hole). Both are learned directly on the real robot, the pouring task in 65 episodes (16 min) and the grid-clamp placement task in 75 episodes (17 min), with strong safety guarantees.
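The idea of letting known constraints shrink the action space can be illustrated with a minimal Python sketch. It is not the authors' SCT implementation: the class `PouringTemplate`, its methods, the tilt limits, and the fixed pose values are all hypothetical, chosen only to show how a one-dimensional agent action might be expanded into a full end-effector pose that always satisfies the modeled constraints.

```python
import math
from dataclasses import dataclass


@dataclass
class PouringTemplate:
    """Hypothetical stand-in for a shared control template (not the paper's API)."""

    max_tilt: float = math.radians(120.0)  # assumed upper bound on the tilt angle
    max_step: float = math.radians(5.0)    # assumed largest tilt change per step

    def apply(self, tilt: float, action: float) -> float:
        """Map a 1-D agent action in [-1, 1] to a new tilt inside the limits."""
        action = max(-1.0, min(1.0, action))           # clip out-of-range actions
        new_tilt = tilt + action * self.max_step       # scaled tilt increment
        return max(0.0, min(self.max_tilt, new_tilt))  # project into [0, max_tilt]

    def to_pose(self, tilt: float) -> tuple:
        """Expand the 1-D state into a 6-D pose (x, y, z, roll, pitch, yaw)."""
        # Position and the remaining rotations are fixed by the template here,
        # so the agent never explores them (values are illustrative only).
        return (0.5, 0.0, 0.3, tilt, 0.0, 0.0)


template = PouringTemplate()
tilt = 0.0
for action in (1.0, 1.0, -0.5):  # example actions an agent might emit
    tilt = template.apply(tilt, action)
pose = template.to_pose(tilt)
```

In this sketch the agent explores a single bounded degree of freedom instead of a six-dimensional pose, so no action it selects can leave the range the template allows. That is the sense in which explicit constraints can make exploration safer and the learning problem smaller.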