Semantic Segmentation and 6DoF Pose Estimation using RGB-D Images and Deep Neural Networks

Van Luan Tran, Huei-Yung Lin
DOI: https://doi.org/10.1109/isie45552.2021.9576248
2021-06-20
Abstract: 6DoF object pose estimation for manipulation robots has become an essential task in robotic and industrial applications. While deep learning methods have significantly advanced object detection and semantic segmentation, 6DoF pose estimation remains challenging. Specifically, it is used with visual sensors to give a robotic manipulator the information it needs to interact with target objects. Thus, 6DoF pose estimation and object recognition from point clouds or RGB-D images are essential tasks for visual servoing. This paper proposes a learning-based method for estimating 6DoF object pose for manipulation robots in industrial settings. A deep convolutional neural network (CNN) for semantic segmentation on RGB images is proposed. The network determines the target object area, which is then combined with depth information to perform 6DoF object pose estimation using the ICP algorithm. We built our own dataset for training and evaluation, with mean intersection over union (mIoU) as the metric. Compared with other approaches that use a limited amount of training data, our proposed approach provides better performance. To test and validate our solution, we used a HIWIN 6-axis robot with an Asus Xtion Live 3D camera. We demonstrate a robotic grasping application using this method, which accurately estimates 6DoF object poses and achieves a high grasping success rate.
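The abstract does not give implementation details, so the following is only a minimal sketch of two steps it names: scoring a segmentation with mIoU, and combining the predicted object mask with depth to obtain the 3D points that an ICP registration would then align to an object model. The function names, the pinhole intrinsics (fx, fy, cx, cy), and the convention that a depth of 0 means "no reading" are assumptions made for illustration, not the authors' code. The ICP step itself is not shown.

```python
from typing import List, Sequence, Tuple

Grid = Sequence[Sequence[float]]


def mean_iou(pred: Grid, target: Grid, num_classes: int) -> float:
    """Mean intersection over union, averaged over classes that occur
    in either the predicted or the ground-truth label map."""
    ious: List[float] = []
    for c in range(num_classes):
        inter = union = 0
        for p_row, t_row in zip(pred, target):
            for p, t in zip(p_row, t_row):
                in_p, in_t = p == c, t == c
                inter += in_p and in_t
                union += in_p or in_t
        if union:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def masked_points(mask: Grid, depth: Grid, object_id: int,
                  fx: float, fy: float, cx: float, cy: float
                  ) -> List[Tuple[float, float, float]]:
    """Back-project pixels labelled `object_id` into camera-frame 3D points
    with a pinhole model (assumed here; intrinsics are hypothetical inputs).
    Pixels with depth <= 0 are treated as missing and skipped."""
    points = []
    for v, (m_row, d_row) in enumerate(zip(mask, depth)):
        for u, (label, z) in enumerate(zip(m_row, d_row)):
            if label == object_id and z > 0:
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


# Toy 2x2 example: label 1 is the target object.
pred = [[0, 1], [1, 1]]
truth = [[0, 1], [0, 1]]
depth = [[1.0, 2.0], [0.0, 4.0]]  # same units as the returned points

score = mean_iou(pred, truth, num_classes=2)
cloud = masked_points(pred, depth, object_id=1, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

In a pipeline of the kind the abstract describes, a point list like `cloud` would be the source cloud passed to an ICP routine together with a model of the object, and the resulting rigid transform would be the 6DoF pose estimate.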