Abstract: Reinforcement learning (RL) has been increasingly used for single peg-in-hole assembly, where the assembly skill is learned through interaction with the assembly environment, in a manner similar to how human beings acquire skills. However, existing RL algorithms are difficult to apply to multiple peg-in-hole assembly because the much more complicated assembly environment requires extensive exploration, resulting in long training times and low data efficiency. To this end, this article focuses on how to predict the assembly environment and how to use the predicted environment in assembly action control to improve the data efficiency of the RL algorithm. Specifically, first, the assembly environment is exactly predicted by a variable time-scale prediction (VTSP) defined as general value functions (GVFs), reducing unnecessary exploration. Second, we propose a fuzzy logic-driven variable time-scale prediction-based reinforcement learning (FLDVTSP-RL) method for assembly action control to improve the efficiency of the RL algorithm, in which the predicted environment is mapped by a fuzzy logic system (FLS) to an impedance parameter in the proposed impedance action space, and this parameter serves as the action baseline. To demonstrate the effectiveness of VTSP and the data efficiency of the FLDVTSP-RL methods, a dual peg-in-hole assembly experiment is set up; the results show that FLDVTSP-DQN decreases the assembly time by about 44% compared with deep Q-learning (DQN), and FLDVTSP-DDPG decreases the assembly time by about 24% compared with deep deterministic policy gradient (DDPG).

Note to Practitioners: The complicated assembly environment of multiple peg-in-hole assembly results in a contact state that cannot be recognized exactly from the force sensor. Therefore, contact-model-based methods, which require tuning the control parameters based on contact state recognition, cannot be applied directly in this complicated environment. Reinforcement learning (RL) methods that do not need contact state recognition have recently attracted scientific interest. However, existing RL methods still rely on numerous explorations and a long training time, so they cannot be directly applied to real-world tasks. This article takes inspiration from the way human beings learn assembly skills in a few trials, which relies on variable time-scale predictions (VTSPs) of the environment and an optimized assembly action control strategy. Our proposed fuzzy logic-driven variable time-scale prediction-based reinforcement learning (FLDVTSP-RL) can be implemented in two steps. First, the assembly environment is predicted by the VTSP defined as general value functions (GVFs). Second, assembly action control is realized in an impedance action space, with a baseline defined by the impedance parameter that the fuzzy logic system (FLS) maps from the predicted environment. Finally, a dual peg-in-hole assembly experiment is conducted; compared with deep Q-learning (DQN), FLDVTSP-DQN decreases the assembly time by about 44%, and compared with deep deterministic policy gradient (DDPG), FLDVTSP-DDPG decreases the assembly time by about 24%.
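The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation, and everything concrete in it is an assumption made for illustration: the names `GVFPredictor`, `fuzzy_stiffness_baseline`, and `impedance_action` are hypothetical, the GVFs are learned with linear TD(0), each time scale is represented by one discount factor, the predicted signal is a force normalized to [0, 1], and the membership functions and stiffness levels are arbitrary placeholder values.

```python
import numpy as np


class GVFPredictor:
    """Linear TD(0) learner holding one general value function per discount.

    Assumption: each discount gamma stands for one prediction time scale of
    roughly 1 / (1 - gamma) steps; the cumulant is a sensor signal such as
    contact force.
    """

    def __init__(self, n_features, gammas, step_size=0.1):
        self.gammas = np.asarray(gammas, dtype=float)
        self.w = np.zeros((len(self.gammas), n_features))
        self.step_size = step_size

    def predict(self, x):
        # One prediction per time scale for feature vector x.
        return self.w @ x

    def update(self, x, cumulant, x_next):
        # TD error per time scale: c + gamma * v(x') - v(x).
        delta = cumulant + self.gammas * self.predict(x_next) - self.predict(x)
        self.w += self.step_size * np.outer(delta, x)
        return delta


def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    return max(0.0, min((x - a) / (b - a), (c - x) / (c - b)))


def fuzzy_stiffness_baseline(pred_force, k_levels=(800.0, 400.0, 100.0)):
    """Map a normalized force prediction in [0, 1] to a stiffness baseline.

    Hypothetical rules: low predicted force -> stiff, medium -> moderate,
    high -> compliant. Defuzzification is a membership-weighted average.
    """
    x = float(np.clip(pred_force, 0.0, 1.0))
    mu = np.array([
        tri(x, -0.5, 0.0, 0.5),  # "low"
        tri(x, 0.0, 0.5, 1.0),   # "medium"
        tri(x, 0.5, 1.0, 1.5),   # "high"
    ])
    return float(mu @ np.asarray(k_levels) / mu.sum())


def impedance_action(baseline, residual, k_min=50.0, k_max=1000.0):
    """Add the RL agent's residual to the fuzzy baseline and clip to limits."""
    return float(np.clip(baseline + residual, k_min, k_max))
```

In this sketch the RL agent (for example DQN or DDPG) would only output the `residual`, so exploration starts around a baseline that already reflects the predicted contact condition rather than from an arbitrary impedance value. How the source paper combines predictions from different time scales before the fuzzy mapping is not stated in the abstract and is left open here.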

Deep Reinforcement Learning for Peg-in-hole Assembly Task Via Information Utilization Method

Learning-based Optimization Algorithms Combining Force Control Strategies for Peg-in-Hole Assembly

Perception of Demonstration for Automatic Programing of Robotic Assembly: Framework, Algorithm, and Validation

Active compliance control of robot peg-in-hole assembly based on combined reinforcement learning

Deep Visual-guided and Deep Reinforcement Learning Algorithm Based for Multip-Peg-in-Hole Assembly Task of Power Distribution Live-line Operation Robot

Multiple peg-in-hole compliant assembly based on a learning-accelerated deep deterministic policy gradient strategy

Sample-Efficiency, Stability and Generalization Analysis for Deep Reinforcement Learning on Robotic Peg-in-Hole Assembly

Feedback Deep Deterministic Policy Gradient With Fuzzy Reward for Robotic Multiple Peg-in-Hole Assembly Tasks

Imitation Learning Study for Robotic Peg-in-hole Assembly

Variable Compliance Control for Robotic Peg-in-Hole Assembly: A Deep Reinforcement Learning Approach

Fuzzy Logic-Driven Variable Time-Scale Prediction-Based Reinforcement Learning for Robotic Multiple Peg-in-Hole Assembly

Robot autonomous grasping and assembly skill learning based on deep reinforcement learning

Knowledge-Driven Deep Deterministic Policy Gradient for Robotic Multiple Peg-in-Hole Assembly Tasks

Demonstration Guided Actor-Critic Deep Reinforcement Learning for Fast Teaching of Robots in Dynamic Environments

Local connection reinforcement learning method for efficient robotic peg-in-hole assembly

Deep Siamese Neural Network-Driven Model for Robotic Multiple Peg-in-Hole Assembly System

Geometric-Feature Representation Based Pre-Training Method for Reinforcement Learning of Peg-in-Hole Tasks

Peg-in-hole Assembly Skill Imitation Learning Method Based on ProMPs under Task Geometric Representation

Robotic peg-in-hole assembly based on reversible dynamic movement primitives and trajectory optimization

Extended residual learning with one-shot imitation learning for robotic assembly in semi-structured environment

Mastering Autonomous Assembly in Fusion Application with Learning-by-doing: a Peg-in-hole Study