Abstract:

Purpose: This paper aims to use a fully convolutional network (FCN) to predict pixel-wise antipodal grasp affordances for unknown objects and to improve grasp detection performance through multi-scale feature fusion.

Design/methodology/approach: A modified FCN is used as the backbone to extract pixel-wise features from the input image. These features are fused with multi-scale context information gathered by a three-level pyramid pooling module to make more robust predictions. Based on the proposed unified feature embedding framework, two head networks are designed to implement different grasp rotation prediction strategies (regression and classification), and their performance is evaluated and compared using a defined point metric. The regression network is further extended to predict grasp rectangles for comparison with previous methods and for real-world robotic grasping of unknown objects.

Findings: The ablation study of the pyramid pooling module shows that multi-scale information fusion significantly improves model performance. The regression approach outperforms the classification approach, built on the same feature embedding framework, on two data sets. The regression network achieves state-of-the-art accuracy (up to 98.9%) and speed (4 ms per image), and a high success rate (97% for household objects, 94.4% for adversarial objects and 95.3% for objects in clutter) in the unknown object grasping experiment.

Originality/value: A novel pixel-wise grasp affordance prediction network based on multi-scale feature fusion is proposed to improve grasp detection performance. Two prediction approaches are formulated and compared based on the proposed framework. The proposed method achieves excellent performance on three benchmark data sets and in a real-world robotic grasping experiment.
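The multi-scale fusion step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the bin sizes `(1, 2, 4)`, the nearest-neighbour upsampling, and the function names `adaptive_avg_pool`, `upsample_nearest` and `pyramid_pool_fuse` are assumptions chosen for illustration, and the learned convolutions that a real pyramid pooling module would apply to each branch are omitted.

```python
import numpy as np


def adaptive_avg_pool(feat, bins):
    """Average-pool a (C, H, W) feature map into a (C, bins, bins) grid."""
    c, h, w = feat.shape
    out = np.empty((c, bins, bins), dtype=np.float64)
    for i in range(bins):
        # Floor for the start index and ceiling for the end index,
        # so every input row/column falls into at least one bin.
        r0 = (i * h) // bins
        r1 = ((i + 1) * h + bins - 1) // bins
        for j in range(bins):
            c0 = (j * w) // bins
            c1 = ((j + 1) * w + bins - 1) // bins
            out[:, i, j] = feat[:, r0:r1, c0:c1].mean(axis=(1, 2))
    return out


def upsample_nearest(pooled, h, w):
    """Nearest-neighbour upsampling of a (C, b, b) grid back to (C, h, w)."""
    bins_h, bins_w = pooled.shape[1:]
    rows = (np.arange(h) * bins_h) // h
    cols = (np.arange(w) * bins_w) // w
    return pooled[:, rows][:, :, cols]


def pyramid_pool_fuse(feat, levels=(1, 2, 4)):
    """Concatenate pixel-wise features with context pooled at several scales.

    `levels` holds hypothetical bin sizes for a three-level pyramid;
    the abstract does not state the sizes actually used.
    """
    _, h, w = feat.shape
    branches = [feat.astype(np.float64)]
    for bins in levels:
        pooled = adaptive_avg_pool(feat, bins)
        branches.append(upsample_nearest(pooled, h, w))
    # Channel-wise concatenation: C * (1 + len(levels)) output channels.
    return np.concatenate(branches, axis=0)


if __name__ == "__main__":
    features = np.random.rand(16, 32, 32)  # stand-in for backbone output
    fused = pyramid_pool_fuse(features)
    print(fused.shape)
```

Each output pixel keeps its own local feature vector and gains the averaged context of progressively smaller surrounding regions, which is the kind of information a per-pixel grasp head could draw on.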

Cascaded Feature Fusion Grasping Network for Real-Time Robotic Systems

A Cascaded Deep Learning Framework for Real-time and Robust Grasp Planning

Residual Squeeze-and-Excitation Network with Multi-scale Spatial Pyramid Module for Fast Robotic Grasping Detection

Recurrent Volume-based 3D Feature Fusion for Real-time Multi-view Object Pose Estimation

Efficient Grasp Detection Network with Gaussian-Based Grasp Representation for Robotic Manipulation

FFBGNet: Full-Flow Bidirectional Feature Fusion Grasp Detection Network Based on Hybrid Architecture

GraspFusionNet: a Two-Stage Multi-Parameter Grasp Detection Network Based on RGB–XYZ Fusion in Dense Clutter

Efficient Fully Convolutional Network and Optimization Approach for Robotic Grasping Detection Based on RGB-D Images

Real-time Pixel-Wise Grasp Affordance Prediction Based on Multi-Scale Context Information Fusion

Robotic Grasp Detection Method Based on Lightweight Feature Fusion Convolutional Neural Network

Real-Time Robotic Grasp Detection with Multi-Scale Feature Fusion

Real-Time Pixel-Wise Grasp Detection Based on RGB-D Feature Dense Fusion

Rotation adaptive grasping estimation network oriented to unknown objects based on novel RGB-D fusion strategy

FAGD-Net: Feature-Augmented Grasp Detection Network Based on Efficient Multi-Scale Attention and Fusion Mechanisms

Enhancement of real-time grasp detection by cascaded deep convolutional neural networks

RFFCE: Residual Feature Fusion and Confidence Evaluation Network for 6DoF Pose Estimation

DSC-GraspNet: A Lightweight Convolutional Neural Network for Robotic Grasp Detection

FCNN-GraspNet: a Streamlined Neural Network for Robotic Grasp Detection

Joint Segmentation and Grasp Pose Detection with Multi-Modal Feature Fusion Network

High-performance Pixel-level Grasp Detection Based on Adaptive Grasping and Grasp-aware Network

A Depth Adaptive Feature Extraction and Dense Prediction Network for 6-D Pose Estimation in Robotic Grasping