Knowledge-guided Robot Learning on Compliance Control for Robotic Assembly Task with Predictive Model

Quan Liu, Zhenrui Ji, Wenjun Xu, Zhihao Liu, Bitao Yao, Zude Zhou
DOI: https://doi.org/10.1016/j.eswa.2023.121037
IF: 8.5
2023-01-01
Expert Systems with Applications
Abstract: Industrial robots have become key equipment in smart manufacturing, and assembly is one of the dominant fields of robotic application. However, robotic assembly still relies heavily on manual programming and is performed repetitively in highly controlled, structured environments, so it generalizes poorly. Recent successes in robot learning show that endowing robots with the intelligence to acquire skills autonomously is a promising approach. Existing robot learning methods are nevertheless difficult to apply because they require extensive trial-and-error exploration, which is costly in hardware and time. When encountering an unfamiliar task, humans naturally use their prior knowledge as guidance to derive exploratory actions and then learn the related skills from the accumulated experience. Inspired by this, the paper proposes a knowledge-guided robot learning method with a predictive model to improve the safety and efficiency of assembly skill acquisition. Concretely, based on Cartesian compliance control, a knowledge-guided exploration strategy (KGES) that applies fuzzy logic to position/force feedback is built to provide direction and limit the range of exploration in the early learning stages. Building on KGES, a predictive model-based reinforcement learning method is proposed to optimize the local search trajectory: training data for policy optimization is generated from trained ensemble predictive models using a knowledge-guided branched progressive rollout method. Finally, the proposed method is tested on two peg-in-hole assembly tasks in the MuJoCo environment. The results show that the robot learns the assembly skill faster and achieves a higher success rate than in the model-free and knowledge-free settings, while keeping the contact force within a safe range.
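The abstract does not give the KGES rules, so the following is only a minimal sketch of the general idea of using fuzzy logic on force feedback to limit the range of exploration. Everything in it is an assumption made for illustration: the function names, the three fuzzy sets (low, medium, high contact force), the 10 N safe-force value, the breakpoints, the output scales, and the step sizes are hypothetical and are not taken from the paper. It covers only the range-limiting part, not the directional guidance.

```python
import math
import random


def falling(x, a, b):
    """Membership that is 1 below a and falls linearly to 0 at b."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def rising(x, a, b):
    """Membership that is 0 below a and rises linearly to 1 at b."""
    return 1.0 - falling(x, a, b)


def triangle(x, a, b, c):
    """Triangular membership: 0 at a, 1 at b, 0 at c."""
    return min(rising(x, a, b), falling(x, b, c))


def exploration_scale(force_mag, f_safe=10.0):
    """Map contact-force magnitude (N) to an exploration scale in [0.1, 1.0].

    Hypothetical rules: low force -> explore widely (1.0),
    medium force -> explore moderately (0.5),
    high force -> explore very little (0.1).
    The outputs are combined by a weighted average of rule strengths.
    """
    low = falling(force_mag, 0.2 * f_safe, 0.5 * f_safe)
    medium = triangle(force_mag, 0.2 * f_safe, 0.5 * f_safe, 0.9 * f_safe)
    high = rising(force_mag, 0.5 * f_safe, 0.9 * f_safe)
    weights = (low, medium, high)
    outputs = (1.0, 0.5, 0.1)
    return sum(w * o for w, o in zip(weights, outputs)) / sum(weights)


def guided_exploration(policy_xy, force_xy, sigma=0.001, max_step=0.002, rng=None):
    """Add exploration noise to a lateral policy action (m), then clip the step.

    Both the noise level and the step limit shrink as contact force grows.
    """
    rng = rng or random.Random()
    scale = exploration_scale(math.hypot(*force_xy))
    step = [a + rng.gauss(0.0, sigma * scale) for a in policy_xy]
    norm = math.hypot(*step)
    limit = max_step * scale
    if norm > limit:
        step = [s * limit / norm for s in step]
    return step
```

With these assumed breakpoints, a peg in free space explores at the full step limit, while a peg pressing hard against the hole edge is restricted to one tenth of it. That is one plausible way to keep contact forces bounded during early learning.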