3D Clothed Human Reconstruction in the Wild

Gyeongsik Moon, Hyeongjin Nam, Takaaki Shiratori, Kyoung Mu Lee

DOI: https://doi.org/10.48550/arXiv.2207.10053

2022-07-21

Abstract: Although much progress has been made in 3D clothed human reconstruction, most existing methods fail to produce robust results from in-the-wild images, which contain diverse human poses and appearances. This is mainly due to the large domain gap between training datasets and in-the-wild datasets. The training datasets are usually synthetic ones, which contain images rendered from ground-truth (GT) 3D scans. However, such datasets contain simple human poses and less natural image appearances than real in-the-wild datasets, which makes generalization to in-the-wild images extremely challenging. To resolve this issue, we propose ClothWild, a 3D clothed human reconstruction framework that, for the first time, addresses robustness on in-the-wild images. First, for robustness to the domain gap, we propose a weakly supervised pipeline that is trainable with 2D supervision targets of in-the-wild datasets. Second, we design a DensePose-based loss function to reduce ambiguities of the weak supervision. Extensive empirical tests on several public in-the-wild datasets demonstrate that ClothWild produces much more accurate and robust results than state-of-the-art methods. The code is available at https://github.com/hygenie1228/ClothWild_RELEASE.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses the problem of reconstructing a 3D model of a clothed human body from a single image captured in a complex real-world environment. Although great progress has been made in 3D clothed human reconstruction, most existing methods do not work well on real-world images, which contain diverse human poses and appearances. This is mainly due to the large domain gap between the training datasets and real-world datasets. The training datasets are usually synthetic, containing images rendered from 3D scans. The human poses in these datasets are simple and the image appearances are less natural, whereas real-world datasets are far more diverse, which makes it difficult for existing methods to generalize to real-world images.

To solve this problem, the paper proposes ClothWild, a 3D clothed human reconstruction framework that aims to improve robustness on real-world images. ClothWild addresses the domain gap in two ways:

1. **Weakly supervised pipeline**: a pipeline that can be trained using 2D supervision targets from real-world datasets.
2. **DensePose-based loss function**: a loss function designed to reduce the ambiguity of the weak supervision.

Tests on multiple public real-world datasets show that ClothWild produces more accurate and robust results than existing state-of-the-art methods.
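To make the idea of supervising a 3D prediction with 2D labels more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a DensePose-style correspondence has already mapped each labelled image pixel to a point on the body surface, and that the model predicts, for each surface point, a probability of being covered by cloth. The function name, the argument names, and the use of binary cross-entropy are all illustrative assumptions; the loss actually used in ClothWild may differ.

```python
import numpy as np


def dense_correspondence_cloth_loss(pred_prob, pixel_to_point, pixel_is_cloth):
    """Hypothetical weak-supervision loss (illustrative only).

    pred_prob:      (P,) predicted probability that each body-surface
                    point is covered by cloth.
    pixel_to_point: (N,) index of the surface point each labelled pixel
                    maps to via a DensePose-like correspondence;
                    -1 marks pixels with no correspondence.
    pixel_is_cloth: (N,) boolean 2D cloth segmentation label per pixel.
    """
    pred_prob = np.asarray(pred_prob, dtype=np.float64)
    pixel_to_point = np.asarray(pixel_to_point)
    pixel_is_cloth = np.asarray(pixel_is_cloth, dtype=bool)

    # Keep only pixels that have a surface correspondence.
    valid = pixel_to_point >= 0
    idx = pixel_to_point[valid]
    if idx.size == 0:
        return 0.0

    # Carry the 2D labels onto the surface points they correspond to.
    target = pixel_is_cloth[valid].astype(np.float64)

    # Binary cross-entropy, clipped for numerical stability.
    p = np.clip(pred_prob[idx], 1e-7, 1.0 - 1e-7)
    bce = -(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))
    return float(bce.mean())
```

The point of the sketch is that the dense correspondence tells each 2D label which surface point it constrains, so the label is no longer ambiguous about where on the body it applies. Pixels without a correspondence are simply ignored.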

CLOTH3D: Clothed 3D Humans

Innovative AI techniques for photorealistic 3D clothed human reconstruction from monocular images or videos: a survey

SelfRecon: Self Reconstruction Your Digital Avatar from Monocular Video

Multi-View Consistency Loss for Improved Single-Image 3D Reconstruction of Clothed People

Cloth2Body: Generating 3D Human Body Mesh from 2D Clothing

3D Body Shapes Estimation from Dressed-Human Silhouettes.

CloTH-VTON: Clothing Three-Dimensional Reconstruction for Hybrid Image-Based Virtual Try-ON

TeCH: Text-guided Reconstruction of Lifelike Clothed Humans

Towards Generalization of 3D Human Pose Estimation In The Wild

Garment4D: Garment Reconstruction from Point Cloud Sequences

SeSDF: Self-evolved Signed Distance Field for Implicit 3D Clothed Human Reconstruction

Implicit 3D Human Reconstruction Guided by Parametric Models and Normal Maps

DLCA-Recon: Dynamic Loose Clothing Avatar Reconstruction from Monocular Videos

Design2Cloth: 3D Cloth Generation from 2D Masks

ECON: Explicit Clothed humans Optimized via Normal integration

MultiGO: Towards Multi-level Geometry Learning for Monocular 3D Textured Human Reconstruction

GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details

MOSS: Motion-based 3D Clothed Human Synthesis from Monocular Video

4D-DRESS: A 4D Dataset of Real-world Human Clothing with Semantic Annotations

HiLo: Detailed and Robust 3D Clothed Human Reconstruction with High-and Low-Frequency Information of Parametric Models