3DPointCaps++: Learning 3D Representations with Capsule Networks

Yongheng Zhao,Guangchi Fang,Yulan Guo,Leonidas Guibas,Federico Tombari,Tolga Birdal
DOI: https://doi.org/10.1007/s11263-022-01632-6
Abstract:We present 3DPointCaps++ for learning robust, flexible and generalizable 3D object representations without requiring heavy annotation efforts or supervision. Unlike conventional 3D generative models, our algorithm aims for building a structured latent space where certain factors of shape variations, such as object parts, can be disentangled into independent sub-spaces. Our novel decoder then acts on these individual latent sub-spaces (i.e. capsules) using deconvolution operators to reconstruct 3D points in a self-supervised manner. We further introduce a cluster loss ensuring that the points reconstructed by a single capsule remain local and do not spread across the object uncontrollably. These contributions allow our network to tackle the challenging tasks of part segmentation, part interpolation/replacement as well as correspondence estimation across rigid / non-rigid shape, and across / within category. Our extensive evaluations on ShapeNet objects and human scans demonstrate that our network can learn generic representations that are robust and useful in many applications.
What problem does this paper attempt to address?