Learning Canonical Shape Space for Category-Level 6D Object Pose and Size Estimation

Dengsheng Chen, Jun Li, Zheng Wang, Kai Xu
DOI: https://doi.org/10.1109/cvpr42600.2020.01199
2020-01-01
Abstract: We present a novel approach to category-level 6D object pose and size estimation. To tackle intra-class shape variations, we learn a canonical shape space (CASS), a unified representation for a large variety of instances of a certain object category. In particular, CASS is modeled as the latent space of a deep generative model of canonical 3D shapes with normalized pose. We train a variational auto-encoder (VAE) for generating 3D point clouds in the canonical space from an RGBD image. The VAE is trained in a cross-category fashion, exploiting the publicly available large 3D shape repositories. Since the 3D point cloud is generated in normalized pose (with actual size), the encoder of the VAE learns a view-factorized RGBD embedding. It maps an RGBD image in an arbitrary view into a pose-independent 3D shape representation. Object pose is then estimated by contrasting this representation with a pose-dependent feature of the input RGBD image, extracted with a separate deep neural network. We integrate the learning of CASS and pose and size estimation into an end-to-end trainable network, achieving state-of-the-art performance.