Abstract: With sufficient practice, humans can grasp objects they have never seen before by relying on the brain's decision-making. Manipulators, although widely used in industrial production, can still grasp only specific objects. Most grasping algorithms rely on prior knowledge, such as hand-eye calibration results and object model features, and target only specific types of objects, so they cannot be redeployed effectively when the task scenario or the target object changes. To address this, academia often uses reinforcement learning to train grasping algorithms. However, reinforcement learning for manipulator grasping faces three main problems: insufficient sample utilization, poor algorithm stability, and limited exploration. This article combines learning from demonstration (LfD), behavior cloning (BC), and deep deterministic policy gradient (DDPG) to improve sample utilization. It uses multiple critics to jointly evaluate input actions, which addresses the instability of the algorithm. Finally, inspired by Thompson sampling, the input action is evaluated from different angles, which increases the algorithm's exploration of the environment and reduces the number of interactions with the environment. Two algorithms, EDDPG and EBDDPG, are designed in this article. To further improve generalization, the method does not use extra information that is difficult to obtain directly on a physical platform, such as the real coordinates of the target object, and it uses the continuous motion space of the manipulator's end effector in the Cartesian coordinate system as the output of the decision. The simulation results show that, under the same number of interactions, the manipulator's success rate in grasping 1000 random objects more than doubles and reaches state-of-the-art (SOTA) performance.
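The two ensemble ideas in the abstract, joint evaluation of an action by several critics and Thompson-style exploration, can be illustrated with a minimal sketch. It is not the authors' implementation: the linear `LinearCritic`, the mean used as the aggregation rule, the finite set of candidate actions, and all dimensions are assumptions chosen to keep the example small. In a DDPG-style method the critics would be neural networks and the action would come from an actor network.

```python
import numpy as np


class LinearCritic:
    """Hypothetical toy critic: Q(s, a) = w . concat(s, a)."""

    def __init__(self, state_dim, action_dim, rng):
        # Each critic gets its own random initialization, so the ensemble disagrees.
        self.w = rng.normal(scale=0.1, size=state_dim + action_dim)

    def q(self, state, actions):
        # actions has shape (n, action_dim); the state is repeated for each candidate.
        states = np.repeat(state[None, :], len(actions), axis=0)
        return np.concatenate([states, actions], axis=1) @ self.w


def ensemble_q(critics, state, actions):
    """Joint evaluation: average the Q-values of all critics (assumed aggregation)."""
    return np.mean([c.q(state, actions) for c in critics], axis=0)


def thompson_style_action(critics, state, candidates, rng):
    """Thompson-style exploration: sample one critic, act greedily on it."""
    head = int(rng.integers(len(critics)))
    scores = critics[head].q(state, candidates)
    return candidates[int(np.argmax(scores))], head


rng = np.random.default_rng(0)
critics = [LinearCritic(state_dim=4, action_dim=3, rng=rng) for _ in range(5)]
state = rng.normal(size=4)
# Candidate end-effector displacements in Cartesian space (assumed range [-1, 1]).
candidates = rng.uniform(-1.0, 1.0, size=(16, 3))

mean_q = ensemble_q(critics, state, candidates)
action, head = thompson_style_action(critics, state, candidates, rng)
```

Averaging over critics smooths out the error of any single value estimate, while acting on one randomly sampled critic lets the disagreement between critics drive exploration.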

Kernelized Gradient Descent Method for Learning from Demonstration

Combination of Affine Deformation and Dynamic Movement Primitive in Learning Human Motion for Redundant Manipulator

Generalize Robot Learning from Demonstration to Variant Scenarios with Evolutionary Policy Gradient

Auto-LfD: Towards Closing the Loop for Learning from Demonstrations

Trajectory Generation with Multi-Stage Cost Functions Learned from Demonstrations

A Variable Impedance Skill Learning Algorithm Based on Kernelized Movement Primitives

Learning from Successful and Failed Demonstrations via Optimization

Ensemble Bootstrapped Deep Deterministic Policy Gradient For Vision-Based Robotic Grasping

Learning from Demonstration for 7-DOF Anthropomorphic Manipulators Without Offset Via Analytical Inverse Kinematics

Robot Learning from Demonstration Using Elastic Maps

Demonstration Learning and Generalization of Robotic Motor Skills Based on Wearable Motion Tracking Sensors

Learning and generalization of task-parameterized skills through few human demonstrations

Recent Advances in Robot Learning from Demonstration

Diff-LfD: Contact-aware Model-based Learning from Visual Demonstration for Robotic Manipulation Via Differentiable Physics-based Simulation and Rendering

A survey of robot learning from demonstration

Learning from Few Demonstrations with Frame-Weighted Motion Generation

Learning From Sparse Demonstrations

Conditional Neural Expert Processes for Learning Movement Primitives from Demonstration

Learning from Demonstration Framework for Multi-Robot Systems Using Interaction Keypoints and Soft Actor-Critic Methods

Fuzzy dynamical system for robot learning motion skills from human demonstration

Enabling Robots to Identify Missing Steps in Robot Tasks for Guided Learning from Demonstration