Abstract: Grasping is critical for intelligent robots to accomplish sophisticated tasks. Even with multimodal sensor fusion, accurately and reliably estimating grasp poses for complex-shaped objects remains a challenge. In this paper, we design a vision-based grasping platform for a more general case: grasping a variety of objects with a simple parallel gripper, where the grasp detection model consumes either RGB or depth input. Focusing on grasp pose estimation, we propose a deep grasp detector that uses a densely connected Feature Pyramid Network (FPN) feature extractor and multiple two-stage detection units to produce dense grasp pose predictions. In the feature extractor, fusing feature maps from different layers improves both the model's capacity to detect grasp areas of various sizes and the accuracy of the regressed grasp positions. In each two-stage detection unit, the first stage generates horizontal candidate grasp areas, and the second stage refines them to predict rotated grasp poses. We train and validate our grasp pose estimation algorithm on the Cornell Grasp Dataset and the Jacquard Dataset, where the model achieves detection accuracies of 93.3% and 89.6%, respectively. We further design real-world grasp experiments to verify the effectiveness of our vision-based robotic grasping system. The real-scenario trials show that the system is capable of grasping unseen objects and, in particular, achieves robust and accurate grasp pose detection and gripper opening width measurement based on depth sensing only.
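The two-stage idea described above, a horizontal candidate area refined into a rotated grasp pose, can be illustrated with a minimal geometric sketch. This is not the paper's implementation: the box parameterization, the offset encoding (`dx`, `dy`, `dw`, `dh`, `theta`), and the function names `rotated_grasp_corners` and `refine_candidate` are all assumptions chosen for illustration.

```python
import math

def rotated_grasp_corners(cx, cy, w, h, theta):
    """Corners of a w-by-h rectangle centred at (cx, cy), rotated by theta radians.

    Hypothetical helper: the abstract does not specify a grasp parameterization.
    """
    c, s = math.cos(theta), math.sin(theta)
    half_extents = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each corner offset about the centre, then translate.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half_extents]

def refine_candidate(box, offsets):
    """Turn a horizontal candidate (x1, y1, x2, y2) into a rotated grasp rectangle.

    `offsets` = (dx, dy, dw, dh, theta) stands in for second-stage predictions;
    this encoding is an assumption, not taken from the paper.
    """
    x1, y1, x2, y2 = box
    dx, dy, dw, dh, theta = offsets
    w, h = x2 - x1, y2 - y1
    cx = (x1 + x2) / 2 + dx * w   # shift centre relative to box size
    cy = (y1 + y2) / 2 + dy * h
    w_ref = w * math.exp(dw)      # log-scale size refinement
    h_ref = h * math.exp(dh)
    return rotated_grasp_corners(cx, cy, w_ref, h_ref, theta)

# Example: a first-stage horizontal box refined with a 30-degree rotation.
corners = refine_candidate((0.0, 0.0, 4.0, 2.0), (0.0, 0.0, 0.0, 0.0, math.radians(30)))
```

With all offsets at zero the refined rectangle coincides with the horizontal candidate, so in this sketch the second stage only needs to express a correction to the first-stage output.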

A Vision-based Robot Grasping System
