Abstract:Recent advances in 3D deep learning have shown that it is possible to train highly effective deep models for 3D shape generation, directly from 2D images. This is particularly interesting since the availability of 3D models is still limited compared to the massive amount of accessible 2D images, which is invaluable for training. The representation of 3D surfaces itself is a key factor for the quality and resolution of the 3D output. While explicit representations, such as point clouds and voxels, can span a wide range of shape variations, their resolutions are often limited. Mesh-based representations are more efficient but are limited by their ability to handle varying topologies. Implicit surfaces, however, can robustly handle complex shapes, topologies, and also provide flexible resolution control. We address the fundamental problem of learning implicit surfaces for shape inference without the need of 3D supervision. Despite their advantages, it remains nontrivial to (1) formulate a differentiable connection between implicit surfaces and their 2D renderings, which is needed for image-based supervision; and (2) ensure precise geometric properties and control, such as local smoothness. In particular, sampling implicit surfaces densely is also known to be a computationally demanding and very slow operation. To this end, we propose a novel ray-based field probing technique for efficient image-to-field supervision, as well as a general geometric regularizer for implicit surfaces, which provides natural shape priors in unconstrained regions. We demonstrate the effectiveness of our framework on the task of single-view image-based 3D shape digitization and show how we outperform state-of-the-art techniques both quantitatively and qualitatively.

Coarse-to-Fine Implicit Representation Learning for 3D Hand-Object Reconstruction from a Single RGB-D Image

In-Hand 3D Object Reconstruction from a Monocular RGB Video

3D Hand Reconstruction with Both Shape and Appearance from an RGB Image.

Geometric-aware RGB-D Representation Learning for Hand-Object Reconstruction

gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction

DDF-HO: Hand-Held Object Reconstruction via Conditional Directed Distance Field

AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction

Region-Aware Dynamic Filtering Network for 3D Hand Reconstruction

Geometry-aware Two-scale PIFu Representation for Human Reconstruction

Learning Explicit Contact for Implicit Reconstruction of Hand-held Objects from Monocular Images

Self-Supervised Implicit 3D Reconstruction via RGB-D Scans.

HandO: a Hybrid 3D Hand–object Reconstruction Model for Unknown Objects

HandNeRF: Learning to Reconstruct Hand-Object Interaction Scene from a Single RGB Image

IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images

Decoupled Iterative Refinement Framework for Interacting Hands Reconstruction from a Single RGB Image

Object-Compositional Neural Implicit Surfaces

3DFS: Deformable Dense Depth Fusion and Segmentation for Object Reconstruction from a Handheld Camera

3D Keypoint Estimation Using Implicit Representation Learning

SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images

D-SCo: Dual-Stream Conditional Diffusion for Monocular Hand-Held Object Reconstruction

Learning to Infer Implicit Surfaces without 3D Supervision