Abstract: With sufficient practice, humans can, through the brain's decision-making, grasp objects they have never seen before. Manipulators, however, despite their wide range of applications in industrial production, can still grasp only specific objects. Most grasping algorithms rely on prior knowledge, such as hand-eye calibration results and object model features, and can target only specific types of objects; when the task scenario or the target object changes, they cannot be redeployed effectively. To address these problems, researchers often use reinforcement learning to train grasping algorithms. Applied to manipulator grasping, however, reinforcement learning faces three main problems: low sample utilization, poor algorithm stability, and limited exploration. This article combines learning from demonstration (LfD), behavior cloning (BC), and deep deterministic policy gradient (DDPG) to improve sample utilization. It uses multiple critics to jointly evaluate input actions, which addresses the instability of the algorithm. Finally, inspired by Thompson sampling, the input action is evaluated from different perspectives, which increases the algorithm's exploration of the environment and reduces the number of interactions with it. Two algorithms, EDDPG and EBDDPG, are designed on this basis. To further improve generalization, the article does not use extra information that is difficult to obtain directly on a physical platform, such as the true coordinates of the target object, and it uses the continuous motion space of the manipulator's end effector in the Cartesian coordinate system as the decision output. Simulation results show that, for the same number of interactions, the manipulator's success rate in grasping 1000 random objects more than doubles and reaches state-of-the-art (SOTA) performance.
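The abstract does not spell out how the critics are combined or how the Thompson-sampling idea is applied, so the following is only a minimal sketch under stated assumptions, not the EDDPG or EBDDPG implementation. It assumes hypothetical linear critics, averaging as the aggregation rule, and a bootstrapped-style selection in which one critic is sampled at random and followed greedily. All function names and dimensions are illustrative.

```python
import numpy as np

def make_critics(n_critics, state_dim, action_dim, seed=0):
    """Build hypothetical linear critics: Q_k(s, a) = w_k . [s, a]."""
    rng = np.random.default_rng(seed)
    return rng.normal(size=(n_critics, state_dim + action_dim))

def q_values(critics, state, action):
    """Return one Q estimate per critic for a single (state, action) pair."""
    return critics @ np.concatenate([state, action])

def ensemble_q(critics, state, action):
    """Aggregate the critics by averaging (an assumed aggregation rule)."""
    return float(q_values(critics, state, action).mean())

def thompson_select(critics, state, candidates, rng):
    """Sample one critic at random and pick the candidate it rates highest."""
    k = int(rng.integers(len(critics)))
    scores = [q_values(critics, state, a)[k] for a in candidates]
    return int(np.argmax(scores)), k

# Illustrative usage with made-up dimensions.
critics = make_critics(n_critics=5, state_dim=4, action_dim=3)
rng = np.random.default_rng(1)
state = rng.normal(size=4)
candidates = rng.uniform(-1.0, 1.0, size=(8, 3))  # candidate end-effector motions
best, k = thompson_select(critics, state, candidates, rng)
print("critic", k, "prefers candidate", best)
print("ensemble value:", ensemble_q(critics, state, candidates[best]))
```

Averaging the critics gives a smoother value estimate than any single critic. Sampling a different critic per decision makes the agent act on different plausible value estimates, which is one common way to encourage exploration.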

Iterative Residual Policy: for Goal-Conditioned Dynamic Manipulation of Deformable Objects

Ensemble Bootstrapped Deep Deterministic Policy Gradient For Vision-Based Robotic Grasping

DeRi-IGP: Learning to Manipulate Rigid Objects Using Deformable Objects via Iterative Grasp-Pull

Rearranging Deformable Linear Objects for Implicit Goals with Self-Supervised Planning and Control

Residual Learning from Demonstration: Adapting DMPs for Contact-rich Manipulation

Learning to Manipulate Deformable Objects without Demonstrations

Learning to Design and Use Tools for Robotic Manipulation

Dynamic Manipulation of Deformable Objects using Imitation Learning with Adaptation to Hardware Constraints

Dynamic Manipulation of a Deformable Linear Object: Simulation and Learning

Efficient Robot Skill Learning with Imitation from a Single Video for Contact-Rich Fabric Manipulation

Differentiable Particles for General-Purpose Deformable Object Manipulation

Learning Predictive Representations for Deformable Objects Using Contrastive Estimation

DexDLO: Learning Goal-Conditioned Dexterous Policy for Dynamic Manipulation of Deformable Linear Objects

Learning Robotic Manipulation through Visual Planning and Acting

DeformPAM: Data-Efficient Learning for Long-horizon Deformable Object Manipulation via Preference-based Action Alignment

Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation

Learning Foresightful Dense Visual Affordance for Deformable Object Manipulation

Deformable Object Manipulation Using Human Demonstration Enhanced Deep Deterministic Policy Gradient

GenDOM: Generalizable One-shot Deformable Object Manipulation with Parameter-Aware Policy

Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning

Learning Extrinsic Dexterity with Parameterized Manipulation Primitives