A stable method for task priority adaptation in quadratic programming via reinforcement learning

Andrea Testa, Marco Laghi, Edoardo Del Bianco, Gennaro Raiola, Enrico Mingo Hoffman, Arash Ajoudani
DOI: https://doi.org/10.1016/j.rcim.2024.102857
IF: 10.103
2024-09-03
Robotics and Computer-Integrated Manufacturing
Abstract: In emerging manufacturing facilities, robots must become more flexible. They are expected to perform complex jobs and exhibit different behaviors as needed, all within unstructured environments and without requiring reprogramming or setup adjustments. To address this challenge, we introduce A3CQP, a non-strict hierarchical Quadratic Programming (QP) controller. It seamlessly combines motion and interaction functionalities, with task priorities adapted dynamically and autonomously by a Reinforcement Learning-based adaptation module. This module uses the Asynchronous Advantage Actor–Critic (A3C) algorithm to ensure rapid convergence and stable training within continuous action and observation spaces. The experimental validation, involving a collaborative peg-in-hole assembly and the polishing of a wooden plate, demonstrates the effectiveness of the proposed solution in terms of automatic adaptability, responsiveness, flexibility, and safety.
Robotics; Computer Science, Interdisciplinary Applications; Engineering, Manufacturing