Deep Learning-Based Ensemble Approach for Autonomous Object Manipulation with an Anthropomorphic Soft Robot Hand

Edwin Valarezo Añazco, Sara Guerrero, Patricio Rivera Lopez, Ji-Heon Oh, Ga-Hyeon Ryu, Tae-Seong Kim
DOI: https://doi.org/10.3390/electronics13020379
IF: 2.9
2024-01-18
Electronics
Abstract: Autonomous object manipulation is a challenging task in robotics because it requires an essential understanding of the object's parameters such as position, 3D shape, grasping (i.e., touching) areas, and orientation. This work presents an autonomous object manipulation system using an anthropomorphic soft robot hand with deep learning (DL) vision intelligence for object detection, 3D shape reconstruction, and object grasping area generation. Object detection is performed using Faster-RCNN and an RGB-D sensor to produce a partial depth view of the objects randomly located in the working space. Three-dimensional object shape reconstruction is performed using U-Net based on 3D convolutions with bottle-neck layers and skip connections generating a complete 3D shape of the object from the sensed single-depth view. Then, the grasping position and orientation are computed based on the reconstructed 3D object information (e.g., object shape and size) using U-Net based on 3D convolutions and Principal Component Analysis (PCA), respectively. The proposed autonomous object manipulation system is evaluated by grasping and relocating twelve objects not included in the training database, achieving an average of 95% successful object grasping and 93% object relocations.
engineering, electrical & electronic; computer science, information systems; physics, applied
What problem does this paper attempt to address?
The paper addresses the difficulty of autonomous object manipulation with a robot hand, which requires knowing each object's position, 3D shape, grasping areas, and orientation. Specifically, it develops an integrated deep-learning-based pipeline that lets an anthropomorphic soft robot hand detect objects, reconstruct their 3D shapes, and determine their grasping areas and orientations, so that objects of different shapes and sizes can be grasped and relocated. The proposed system works in three steps:

1. **Object Detection**: Faster-RCNN and an RGB-D sensor detect objects placed at random in the workspace and produce a partial depth view of each one.
2. **3D Object Shape Reconstruction**: A U-Net built on 3D convolutions, with bottleneck layers and skip connections, reconstructs the complete 3D shape of the object from the single sensed depth view.
3. **Calculation of Grasping Positions and Orientations**: From the reconstructed 3D object information (such as shape and size), a second U-Net built on 3D convolutions computes the grasping position, and principal component analysis (PCA) computes the grasping orientation.

The system was evaluated by grasping and relocating 12 objects not included in the training database, achieving an average grasping success rate of 95% and an average relocation success rate of 93%. These results suggest that the pipeline generalizes to objects it has not seen during training.
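To make the PCA step concrete, here is a minimal sketch of how an orientation could be estimated from a reconstructed shape. It is not the authors' implementation: the summary does not say how PCA is applied, so the voxel-grid input, the occupancy threshold, and the function name `principal_axes` are all assumptions made for illustration.

```python
import numpy as np


def principal_axes(voxels: np.ndarray, threshold: float = 0.5):
    """Estimate an object's principal axes from a 3D occupancy grid.

    Hypothetical helper: the voxel representation and the threshold are
    assumptions, not details taken from the paper.
    Returns (centroid, axes, variances); axes[:, 0] is the major axis.
    """
    # Coordinates (in voxel units) of every occupied cell, shape (N, 3).
    points = np.argwhere(voxels > threshold).astype(float)
    if len(points) < 3:
        raise ValueError("need at least 3 occupied voxels for PCA")

    centroid = points.mean(axis=0)
    # 3x3 covariance matrix of the occupied-voxel coordinates.
    cov = np.cov(points - centroid, rowvar=False)

    # eigh handles symmetric matrices and returns ascending eigenvalues,
    # so reorder to put the direction of largest variance first.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    return centroid, eigvecs[:, order], eigvals[order]


# Usage: a synthetic elongated box whose long side lies along the x axis.
grid = np.zeros((32, 32, 32))
grid[4:28, 14:18, 15:17] = 1.0
centroid, axes, variances = principal_axes(grid)
major_axis = axes[:, 0]
```

For the synthetic box, the major axis aligns with x up to sign, since an eigenvector's direction is only defined up to a sign flip. A real grasp planner would have to resolve that ambiguity and map voxel coordinates into the robot's frame, neither of which is covered here.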