RICH: Robust Implicit Clothed Humans Reconstruction from Multi-scale Spatial Cues.
Yukang Lin, Ronghui Li, Kedi Lyu, Yachao Zhang, Xiu Li
DOI: https://doi.org/10.1007/978-981-99-8432-9_16
2024-01-01
Abstract: Pixel-aligned implicit functions (IFs) enable the reconstruction of 3D humans with complete and detailed clothing from a single RGB image. To improve robustness to pose, existing work introduces a parametric body model as a prior, but this limits the recovery of geometric detail and makes loose clothing difficult to handle. Our goal is to reconstruct both clothing and pose so that they align closely with the input image, even for unusual poses and complex clothing. To achieve this, we propose RICH, a multi-scale feature-based implicit method that combines the flexibility of implicit functions with the strong prior of a parametric body model. RICH introduces a 3D human body model as prior knowledge and uses local features to constrain the generation of the body. Furthermore, RICH employs a pretrained image encoder to extract global pixel-aligned features, which contribute to accurate and complete reconstruction of clothing geometry and of external appearance such as hair and accessories. In addition, by establishing connections with the joints of the body model, RICH uses an attention mechanism to construct relative spatial features, thereby increasing robustness to pose. Finally, RICH feeds the local, relative, and global features to the IF to query occupancy, and the clothed human is represented by the 0.5 iso-surface of the resulting 3D occupancy field. Quantitative and qualitative evaluations on the THuman2.0 and CAPE datasets show that RICH outperforms state-of-the-art methods. In particular, RICH demonstrates strong generalization to in-the-wild images, even with challenging poses and complex clothing. The code and supplementary material will be available at https://github.com/lyk412/RICH .
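The occupancy query described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: every function name, feature definition, shape, and weight below is a hypothetical stand-in. It assumes an orthographic projection for the pixel-aligned lookup, uses the distance to the nearest joint as a placeholder local feature, builds the relative feature as softmax-attention-weighted offsets to the joints, and replaces the learned IF with a single random linear layer followed by a sigmoid.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def relative_feature(points, joints, temperature=0.1):
    # Hypothetical relative spatial feature: attention over body joints,
    # where joints closer to the query point receive larger weights.
    offsets = points[:, None, :] - joints[None, :, :]   # (N, J, 3)
    dist = np.linalg.norm(offsets, axis=-1)             # (N, J)
    weights = softmax(-dist / temperature, axis=-1)     # (N, J)
    return (weights[..., None] * offsets).sum(axis=1), weights

def pixel_aligned_feature(points, feature_map):
    # Assumed orthographic projection: x, y in [-1, 1] map to image columns
    # and rows; nearest-neighbour lookup stands in for bilinear sampling.
    h, w, _ = feature_map.shape
    cols = np.clip(np.round((points[:, 0] + 1) / 2 * (w - 1)).astype(int), 0, w - 1)
    rows = np.clip(np.round((1 - points[:, 1]) / 2 * (h - 1)).astype(int), 0, h - 1)
    return feature_map[rows, cols]                      # (N, C)

def query_occupancy(points, joints, feature_map, weight, bias):
    # Placeholder local feature: distance to the nearest joint.
    dist = np.linalg.norm(points[:, None, :] - joints[None, :, :], axis=-1)
    local = dist.min(axis=1, keepdims=True)             # (N, 1)
    rel, _ = relative_feature(points, joints)           # (N, 3)
    glob = pixel_aligned_feature(points, feature_map)   # (N, C)
    x = np.concatenate([local, rel, glob], axis=-1)
    # A single linear layer plus sigmoid stands in for the learned IF.
    return 1.0 / (1.0 + np.exp(-(x @ weight + bias)))

rng = np.random.default_rng(0)
points = rng.uniform(-1, 1, size=(64, 3))       # 3D query points
joints = rng.uniform(-1, 1, size=(24, 3))       # toy joint positions
feature_map = rng.normal(size=(16, 16, 8))      # toy encoder output
weight = rng.normal(size=(1 + 3 + 8,))          # random, untrained weights
occupancy = query_occupancy(points, joints, feature_map, weight, bias=0.0)
inside = occupancy > 0.5                        # inside the 0.5 iso-surface
```

In a full pipeline the occupancy would be evaluated on a dense 3D grid and the 0.5 iso-surface extracted with a surface extraction algorithm such as marching cubes; here the threshold only labels each query point as inside or outside.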