VIPNet: A Fast and Accurate Single-View Volumetric Reconstruction by Learning Sparse Implicit Point Guidance

Dong Du, Zhiyi Zhang, Xiaoguang Han, Shuguang Cui, Ligang Liu
DOI: https://doi.org/10.1109/3dv50981.2020.00065
2020-01-01
Abstract: With the advent of deep neural networks, learning-based single-view reconstruction has gained popularity. However, in 3D, there is no dominant representation that is both computationally efficient and accurate while still allowing reconstruction of high-resolution geometry of arbitrary topology. Accurate implicit methods are time-consuming because of dense sampling and inference, while volumetric approaches are fast but limited by heavy memory usage and low accuracy. In this paper, we propose VIPNet, an end-to-end hybrid representation learning framework for fast and accurate single-view reconstruction under sparse implicit point guidance. Given an image, it first generates a volumetric result. Meanwhile, a corresponding implicit shape representation is learned. To balance efficiency and accuracy, we adopt PointGenNet to learn a set of representative points that guide the voxel refinement through the corresponding sparse implicit inference. A strategy of patch-based synthesis with global-local features under implicit guidance is also applied to reduce the memory consumption required to generate high-resolution output. Extensive experiments demonstrate the effectiveness of our method both qualitatively and quantitatively, indicating that the proposed hybrid learning outperforms learning each representation separately. Specifically, our network not only runs 60 times faster than implicit methods but also yields accuracy gains. We hope it will inspire a re-thinking of hybrid representation learning.
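
The idea of refining a coarse volume with only a sparse set of implicit queries can be illustrated with a toy sketch. This is not the paper's network: the learned implicit function is replaced by a hypothetical analytic sphere (`sphere_occupancy`), and the points that PointGenNet would learn are replaced by a simple hand-written heuristic (voxels on the coarse surface boundary, `boundary_mask`). All function names and parameters below are assumptions made for illustration only.

```python
import numpy as np

def sphere_occupancy(points, radius=0.5):
    """Hypothetical stand-in for a learned implicit function:
    returns 1.0 for points inside a sphere, 0.0 otherwise."""
    return (np.linalg.norm(points, axis=-1) <= radius).astype(np.float32)

def voxel_centers(res):
    """Centers of a res^3 grid spanning [-1, 1]^3, shape (res, res, res, 3)."""
    axis = (np.arange(res) + 0.5) / res * 2.0 - 1.0
    return np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)

def boundary_mask(vox):
    """Mark voxels that have a 6-neighbour with different occupancy.
    Assumption: used here in place of learned representative points."""
    mask = np.zeros(vox.shape, dtype=bool)
    for ax in range(3):
        changed = np.diff(vox, axis=ax) != 0
        lo = [slice(None)] * 3
        hi = [slice(None)] * 3
        lo[ax] = slice(0, -1)
        hi[ax] = slice(1, None)
        mask[tuple(lo)] |= changed  # voxel before the occupancy change
        mask[tuple(hi)] |= changed  # voxel after the occupancy change
    return mask

def refine_sparse(coarse, implicit_fn):
    """Re-evaluate only the boundary voxels with the implicit function.
    Returns the refined grid and the number of implicit queries made."""
    centers = voxel_centers(coarse.shape[0])
    mask = boundary_mask(coarse)
    refined = coarse.copy()
    refined[mask] = implicit_fn(centers[mask])
    return refined, int(mask.sum())

res = 32
centers = voxel_centers(res)
target = sphere_occupancy(centers, radius=0.5)    # "true" shape
coarse = sphere_occupancy(centers, radius=0.55)   # slightly wrong coarse volume
refined, n_queries = refine_sparse(coarse, sphere_occupancy)

print("dense queries would be:", res ** 3)
print("sparse queries used:   ", n_queries)
print("wrong voxels before:   ", int((coarse != target).sum()))
print("wrong voxels after:    ", int((refined != target).sum()))
```

In this toy setting the sparse pass touches only the surface shell rather than all `res ** 3` grid cells, which is the intuition behind trading dense implicit inference for guided voxel refinement; the actual speed and accuracy figures reported in the abstract come from the authors' learned model, not from this sketch.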