CenterArt: Joint Shape Reconstruction and 6-DoF Grasp Estimation of Articulated Objects

Sassan Mokhtar, Eugenio Chisari, Nick Heppert, Abhinav Valada
2024-04-23
Abstract: Precisely grasping and reconstructing articulated objects is key to enabling general robotic manipulation. In this paper, we propose CenterArt, a novel approach for simultaneous 3D shape reconstruction and 6-DoF grasp estimation of articulated objects. CenterArt takes RGB-D images of the scene as input and first predicts the shape and joint codes through an encoder. The decoder then leverages these codes to reconstruct 3D shapes and estimate 6-DoF grasp poses of the objects. We further develop a mechanism for generating a dataset of 6-DoF grasp ground truth poses for articulated objects. CenterArt is trained on realistic scenes containing multiple articulated objects with randomized designs, textures, lighting conditions, and realistic depths. We perform extensive experiments demonstrating that CenterArt outperforms existing methods in accuracy and robustness.
Robotics, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses simultaneous 3D shape reconstruction and 6-degree-of-freedom (6-DoF) grasp pose estimation for articulated objects. According to this summary, existing methods rely mainly on reinforcement learning, which requires large amounts of data and training time and generalizes poorly in practical scenarios.

The researchers propose a method called CenterArt that takes RGB-D images as input. An encoder predicts shape and joint codes, and a decoder uses these codes to reconstruct the 3D shape and estimate 6-DoF grasp poses.

The stated contributions of CenterArt are:
1. The first method to perform 3D shape reconstruction and 6-DoF grasp pose estimation simultaneously.
2. A dataset of ground-truth grasp poses for articulated objects.
3. Use of the SAPIEN simulator to create realistic kitchen scenes containing multiple articulated objects.

The experiments show that CenterArt outperforms existing methods in accuracy and robustness, especially on realistic depth images and complex multi-object scenes. Compared with reinforcement learning-based methods, CenterArt is also reported to need less training time and data.
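
The encoder and decoder data flow described above can be illustrated with a toy sketch. This is not the authors' architecture: the function names (`encode`, `decode_shape`, `decode_grasp`), the code sizes, and the random linear maps standing in for learned networks are all assumptions made for illustration. The only points it is meant to show are that one RGB-D image yields two latent codes, and that both decoders consume the same pair of codes, with a 6-DoF grasp expressed as a 4x4 rigid transform (3 rotational plus 3 translational degrees of freedom).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical latent sizes; the summary does not state the real ones.
SHAPE_DIM, JOINT_DIM = 32, 16


def encode(rgbd):
    """Stand-in encoder: pool an H x W x 4 RGB-D image into two latent codes."""
    feat = rgbd.mean(axis=(0, 1))  # (4,) global average over all pixels
    w_shape = rng.standard_normal((SHAPE_DIM, 4))  # random weights, not trained
    w_joint = rng.standard_normal((JOINT_DIM, 4))
    return w_shape @ feat, w_joint @ feat


def decode_shape(shape_code, joint_code, points):
    """Stand-in shape decoder: one scalar per 3D query point (assumed output form)."""
    code = np.concatenate([shape_code, joint_code])
    w = rng.standard_normal(code.size + 3)
    inputs = np.hstack([np.tile(code, (len(points), 1)), points])
    return inputs @ w


def decode_grasp(shape_code, joint_code):
    """Stand-in grasp decoder: return a 4x4 homogeneous pose."""
    code = np.concatenate([shape_code, joint_code])
    raw = rng.standard_normal((9, code.size)) @ code
    # Gram-Schmidt on two 3-vectors yields an orthonormal, right-handed frame.
    a, b = raw[:3], raw[3:6]
    x = a / np.linalg.norm(a)
    b = b - (b @ x) * x
    y = b / np.linalg.norm(b)
    z = np.cross(x, y)
    pose = np.eye(4)
    pose[:3, :3] = np.stack([x, y, z], axis=1)  # rotation part
    pose[:3, 3] = raw[6:9]                      # translation part
    return pose


# Usage with a synthetic image and a few query points.
rgbd = np.random.default_rng(1).random((48, 64, 4))
shape_code, joint_code = encode(rgbd)
values = decode_shape(shape_code, joint_code, np.zeros((5, 3)))
pose = decode_grasp(shape_code, joint_code)
```

Because both decoders read the same codes, shape and grasp predictions stay tied to one object hypothesis, which is the sense in which the summary calls the two tasks simultaneous.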