Abstract: Pixel-aligned implicit functions (IFs) enable the reconstruction of 3D humans with complete and detailed clothing from a single RGB image. To improve robustness to pose, existing works introduce a parametric body model as a prior, but this limits the recovery of geometric detail and makes loose clothing difficult to handle. Our goal is to reconstruct clothing and pose that closely align with the input image, even for unusual poses and complex clothing. To achieve this, we propose a multi-scale feature-based implicit method, called RICH, which combines the flexibility of implicit functions with the strong prior of a parametric body model. RICH introduces a 3D human body model as prior knowledge and adopts local features to constrain human body generation. Furthermore, RICH employs a pretrained image encoder to extract global pixel-aligned features, which contribute to high-precision, complete reconstruction of clothing geometry and of external appearance such as hair and accessories. In addition, by establishing connections with the joints of the body model, RICH uses an attention mechanism to construct relative spatial features, thereby increasing robustness to pose. Finally, RICH feeds the local, relative, and global features to the IF to query occupancy, and the clothed human is represented by the 0.5 iso-surface of the 3D occupancy field. Quantitative and qualitative evaluations on the THuman2.0 and CAPE datasets show that RICH outperforms state-of-the-art methods. In particular, RICH demonstrates strong generalization on in-the-wild images, even with challenging poses and complex clothing. The code and supplementary material will be available at https://github.com/lyk412/RICH .
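As a rough illustration of two ideas in the abstract (a joint-based relative spatial feature built with attention, and the 0.5 occupancy level that defines the surface), the following minimal sketch may help. It is not the RICH implementation: the softmax-over-negative-distance attention, the `temperature` parameter, and the `toy_occupancy` function (a hand-written stand-in for the learned implicit function) are all assumptions made for illustration only.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def relative_spatial_feature(point, joints, temperature=0.1):
    """Attention-weighted offset from a 3D query point to body joints.

    Assumption: attention scores are negative distances scaled by a
    hypothetical temperature, so nearer joints receive larger weights.
    """
    offsets = [[p - j for p, j in zip(point, joint)] for joint in joints]
    dists = [math.dist(point, joint) for joint in joints]
    weights = softmax([-d / temperature for d in dists])
    feature = [sum(w * off[k] for w, off in zip(weights, offsets))
               for k in range(3)]
    return feature, weights

def toy_occupancy(point, joints, radius=0.2, scale=0.05):
    """Stand-in for a learned implicit function (assumption, not RICH).

    Returns a value in (0, 1) that equals 0.5 exactly where the point is
    `radius` away from its nearest joint.
    """
    d = min(math.dist(point, joint) for joint in joints)
    return 1.0 / (1.0 + math.exp(-(radius - d) / scale))

def is_inside(occupancy, level=0.5):
    """A point is inside the surface when occupancy exceeds the iso-level."""
    return occupancy > level
```

In this sketch the surface is the set of points where `toy_occupancy` returns exactly 0.5. In practice a mesh at that level would typically be extracted from a dense occupancy grid with an algorithm such as marching cubes.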
