FAGD-Net: Feature-Augmented Grasp Detection Network Based on Efficient Multi-Scale Attention and Fusion Mechanisms
Xungao Zhong,Xianghui Liu,Tao Gong,Yuan Sun,Huosheng Hu,Qiang Liu
DOI: https://doi.org/10.3390/app14125097
2024-01-01
Abstract:Grasping robots always confront challenges such as uncertainties in object size, orientation, and type, necessitating effective feature augmentation to improve grasping detection performance. However, many prior studies inadequately emphasize grasp-related features, resulting in suboptimal grasping performance. To address this limitation, this paper proposes a new grasping approach termed the Feature-Augmented Grasp Detection Network (FAGD-Net). The proposed network incorporates two modules designed to enhance spatial information features and multi-scale features. Firstly, we introduce the Residual Efficient Multi-Scale Attention (Res-EMA) module, which effectively adjusts the importance of feature channels while preserving precise spatial information within those channels. Additionally, we present a Feature Fusion Pyramidal Module (FFPM) that serves as an intermediary between the encoder and decoder, effectively addressing potential oversights or losses of grasp-related features as the encoder network deepens. As a result, FAGD-Net achieved advanced levels of grasping accuracy, with 98.9% and 96.5% on the Cornell and Jacquard datasets, respectively. The grasp detection model was deployed on a physical robot for real-world grasping experiments, where we conducted a series of trials in diverse scenarios. In these experiments, we randomly selected various unknown household items and adversarial objects. Remarkably, we achieved high success rates, with a 95.0% success rate for single-object household items, 93.3% for multi-object scenarios, and 91.0% for cluttered scenes.