Abstract: Robust and efficient grasping of diverse objects remains an open problem because it requires integrating knowledge from several disciplines, including gripper ontology design, perception, control, and learning. In recent years, learning-based methods have achieved excellent results in grasping novel objects. However, current methods are usually limited to a single grasping mode or rely on different end effectors to grasp objects of different shapes. Human hands, by contrast, can grasp a wide variety of objects by changing the grasping method and the posture of the hand. A gripper with similar versatility could therefore improve a robot's grasping ability. In this paper, we design a dual-modal soft gripper (DSG) and propose a deep reinforcement learning (DRL) framework to operate it. Both grasping modes, enveloping and pinching, are achieved through a tendon drive system and the deformation of a spring steel plate, which enables the gripper to switch between the two modes in real time. We also combine recent advances in deep learning and reinforcement learning to design an autonomous grasping algorithm based on Q-learning and a deep Q-network. Moreover, to make fuller use of the visual input from the sensor, we add semantic embeddings of target objects to facilitate learning, which is especially useful when deciding the grasping mode for previously unseen objects. We evaluate our DRL framework in different scenarios, offering a detailed comparison of each individual grasping mode and the mixed method (with and without semantic information). Our design proves effective in reducing the number of failed grasps and improving the success rate on novel and difficult objects.
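The choice between the two grasping modes can be illustrated with a minimal Q-learning sketch. This is not the paper's method: the abstract describes a deep Q-network driven by visual input, whereas the code below is a tabular simplification. The state labels (such as "cup"), the reward values, and the function names `select_mode` and `q_update` are hypothetical and chosen only for illustration.

```python
import random

# The two grasping modes named in the abstract.
MODES = ("enveloping", "pinching")


def select_mode(q, state, epsilon, rng):
    """Epsilon-greedy choice between the two grasping modes.

    `q` maps (state, mode) pairs to estimated values; missing entries
    are treated as 0.0. `state` is a hypothetical semantic label.
    """
    if rng.random() < epsilon:
        return rng.choice(MODES)  # explore
    return max(MODES, key=lambda m: q.get((state, m), 0.0))  # exploit


def q_update(q, state, mode, reward, next_state,
             alpha=0.5, gamma=0.9, terminal=True):
    """One tabular Q-learning step: Q <- Q + alpha * (target - Q)."""
    if terminal:
        best_next = 0.0  # no future value after the episode ends
    else:
        best_next = max(q.get((next_state, m), 0.0) for m in MODES)
    target = reward + gamma * best_next
    old = q.get((state, mode), 0.0)
    q[(state, mode)] = old + alpha * (target - old)
    return q[(state, mode)]


if __name__ == "__main__":
    q = {}
    rng = random.Random(0)
    # Assumed outcome: a pinch grasp on a "cup" succeeds (reward 1.0).
    q_update(q, "cup", "pinching", 1.0, None)
    print(select_mode(q, "cup", epsilon=0.0, rng=rng))
```

In a deep Q-network the table `q` would be replaced by a neural network that maps the visual input, and in the paper's case semantic embeddings of the target object, to one value per action.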

Implementation and Optimization of Grasping Learning with Dual-modal Soft Gripper.