Abstract: With sufficient practice, humans can grasp objects they have never seen before by relying on the brain's decision-making. Manipulators, however, despite their wide range of applications in industrial production, can still only grasp specific objects. This is because most grasping algorithms rely on prior knowledge, such as hand-eye calibration results and object model features, and can only target specific types of objects. When the task scenario or the target object changes, they cannot be redeployed effectively. To address these problems, researchers often use reinforcement learning to train grasping algorithms. However, reinforcement learning for manipulator grasping faces three main problems: insufficient sample utilization, poor algorithm stability, and limited exploration. This article combines learning from demonstration (LfD), behavior cloning (BC), and deep deterministic policy gradient (DDPG) to improve sample utilization. Multiple critics are integrated to evaluate input actions, which addresses the problem of algorithm instability. Finally, inspired by Thompson sampling, the input action is evaluated from different angles, which increases the algorithm's exploration of the environment and reduces the number of interactions with the environment. Based on these ideas, the article designs the EDDPG and EBDDPG algorithms. To further improve the generalization ability of the algorithms, the article does not use extra information that is difficult to obtain directly on a physical platform, such as the real coordinates of the target object, and it uses the continuous motion space of the manipulator's end effector in the Cartesian coordinate system as the output of the decision. Simulation results show that, under the same number of interactions, the manipulator's success rate in grasping 1000 random objects more than doubles and reaches state-of-the-art (SOTA) performance.
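
The two critic-related ideas in the abstract (integrating several critics for stability, and a Thompson-sampling-style evaluation for exploration) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the linear critics, the class names `LinearCritic` and `CriticEnsemble`, and the choice among a finite set of candidate actions are all assumptions made for illustration. A real DDPG agent would use neural-network critics and an actor that outputs continuous actions.

```python
import numpy as np


class LinearCritic:
    """Toy critic (assumption): Q(s, a) = w . concat(s, a).

    Stands in for the neural-network critic a real DDPG agent would use.
    """

    def __init__(self, dim, rng):
        self.w = rng.normal(size=dim)

    def q(self, state, action):
        return float(self.w @ np.concatenate([state, action]))


class CriticEnsemble:
    """Hypothetical ensemble of independently initialised critics."""

    def __init__(self, n_critics, state_dim, action_dim, seed=0):
        self.rng = np.random.default_rng(seed)
        self.critics = [
            LinearCritic(state_dim + action_dim, self.rng)
            for _ in range(n_critics)
        ]

    def q_values(self, state, action):
        # One Q estimate per critic for the same (state, action) pair.
        return np.array([c.q(state, action) for c in self.critics])

    def mean_q(self, state, action):
        # Averaging the critics smooths out the error of any single one.
        return float(self.q_values(state, action).mean())

    def sample_critic(self):
        # Thompson-style step: draw one critic uniformly at random,
        # e.g. once per episode, and act greedily with respect to it.
        return int(self.rng.integers(len(self.critics)))

    def select_action(self, state, candidates, critic_idx=None):
        # critic_idx=None scores candidates with the ensemble mean;
        # otherwise only the sampled critic is used (exploration).
        if critic_idx is None:
            scores = [self.mean_q(state, a) for a in candidates]
        else:
            critic = self.critics[critic_idx]
            scores = [critic.q(state, a) for a in candidates]
        return candidates[int(np.argmax(scores))]
```

In this sketch, calling `select_action` with a freshly sampled `critic_idx` at the start of each episode makes the agent follow a different value estimate each time, while the default ensemble-mean scoring gives a more stable evaluation of the same actions.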

DexPBT: Scaling up Dexterous Manipulation for Hand-Arm Systems with Population Based Training

DexRepNet: Learning Dexterous Robotic Grasping Network with Geometric and Spatial Hand-Object Representations

Ensemble Bootstrapped Deep Deterministic Policy Gradient For Vision-Based Robotic Grasping

DexDeform: Dexterous Deformable Object Manipulation with Human Demonstrations and Differentiable Physics

Deep Dynamics Models for Learning Dexterous Manipulation

DextrAH-G: Pixels-to-Action Dexterous Arm-Hand Grasping with Geometric Fabrics

Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning

Bi-DexHands: Towards Human-Level Bimanual Dexterous Manipulation

Solving Challenging Dexterous Manipulation Tasks With Trajectory Optimisation and Reinforcement Learning

DexVIP: Learning Dexterous Grasping with Human Hand Pose Priors from Video

DexTransfer: Real World Multi-fingered Dexterous Grasping with Minimal Human Demonstrations

Dexterous Manipulation with Deep Reinforcement Learning: Efficient, General, and Low-Cost

DEFT: Dexterous Fine-Tuning for Real-World Hand Policies

Physics-Based Dexterous Manipulations with Estimated Hand Poses and Residual Reinforcement Learning

DexH2R: Task-oriented Dexterous Manipulation from Human to Robots

DexPilot: Vision-Based Teleoperation of Dexterous Robotic Hand-Arm System

Data-efficient Deep Reinforcement Learning for Dexterous Manipulation

Dexterous Functional Grasping

Learning Diverse and Physically Feasible Dexterous Grasps with Generative Model and Bilevel Optimization

Sampling-based Exploration for Reinforcement Learning of Dexterous Manipulation