Abstract: In this work, we propose a novel clothed human reconstruction method called GaussianBody, based on 3D Gaussian Splatting. Compared with costly neural-radiance-based models, 3D Gaussian Splatting has recently demonstrated great performance in terms of training time and rendering quality. However, applying the static 3D Gaussian Splatting model to the dynamic human reconstruction problem is non-trivial due to complicated non-rigid deformations and rich cloth details. To address these challenges, our method uses explicit pose-guided deformation to associate dynamic Gaussians across the canonical space and the observation space; introducing a physically-based prior with regularized transformations helps mitigate ambiguity between the two spaces. During training, we further propose a pose refinement strategy that updates the pose regression to compensate for inaccurate initial estimates, and a split-with-scale mechanism to enhance the density of the regressed point clouds. The experiments validate that our method achieves state-of-the-art photorealistic novel-view rendering results with high-quality details for dynamic clothed human bodies, along with explicit geometry reconstruction.

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

The paper proposes a new method called **GaussianBody** for reconstructing dynamic clothed human models from monocular videos. Specifically, the paper aims to address the following issues:

1. **High-Fidelity Reconstruction**:
   - Traditional methods rely on complex capture systems or the tedious work of 3D artists, which is time-consuming and costly, making large-scale applications difficult.
   - Existing mesh-based methods (such as SMPL) can quickly reconstruct human shapes but struggle to capture complex geometric details and rich clothing features.
2. **Real-Time Rendering and Efficient Training**:
   - Implicit methods (such as NeRF) can improve reconstruction fidelity and rendering quality, but the complex volumetric rendering process leads to long training times, making real-time applications difficult.
   - Models based on implicit methods lack effective deformation schemes when dealing with complex body movements in dynamic sequences.
3. **Non-Rigid Deformation and Detail Capture**:
   - Dynamic clothed human models exhibit complex non-rigid deformations and rich clothing details, which existing methods struggle to capture accurately.

To address the above challenges, the paper proposes the following key techniques:

- **3D Gaussian Splatting**: Uses a 3D Gaussian point cloud representation for efficient rendering and introduces explicit pose-guided deformation to map the Gaussians from canonical space to observation space.
- **Physics-Based Priors**: Introduces a physically-based prior that regularizes the transformations of the Gaussians, mitigating ambiguity between the canonical and observation spaces and avoiding overfitting.
- **Pose Refinement and Point Cloud Densification**: Proposes a pose refinement strategy to correct errors in the initial pose estimates and a split-with-scale mechanism to increase point cloud density.

Experimental results show that this method excels in detail reconstruction and geometric recovery, with short training times (about 1 hour) and near real-time rendering speed. Additionally, ablation experiments validate the effectiveness of each component.
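To make the pose-guided deformation idea concrete, here is a minimal sketch of how one Gaussian might be carried from canonical space to observation space. The summary does not give the paper's actual formulation, so this assumes linear blend skinning over per-joint rigid transforms, a common choice with SMPL-style body models. Every name here (`rot_z`, `blend_transform`, `deform_gaussian`) is hypothetical and chosen only for illustration.

```python
import math

def rot_z(theta):
    """3x3 rotation about the z-axis (stand-in for a joint rotation)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def transpose(a):
    return [[a[c][r] for c in range(3)] for r in range(3)]

def matvec(a, v):
    return [sum(a[r][k] * v[k] for k in range(3)) for r in range(3)]

def blend_transform(weights, rotations, translations):
    """Weighted sum of per-joint rigid transforms (linear blend skinning).

    Assumption: the weights sum to 1. The blended matrix is only an exact
    rotation when a single joint dominates; that is a known LBS limitation.
    """
    R = [[sum(w * Rj[r][c] for w, Rj in zip(weights, rotations))
          for c in range(3)] for r in range(3)]
    t = [sum(w * tj[r] for w, tj in zip(weights, translations))
         for r in range(3)]
    return R, t

def deform_gaussian(mean, cov, weights, rotations, translations):
    """Map a canonical Gaussian (mean, covariance) to observation space."""
    R, t = blend_transform(weights, rotations, translations)
    new_mean = [m + ti for m, ti in zip(matvec(R, mean), t)]
    # A linear map x -> R x turns covariance S into R S R^T.
    new_cov = matmul(matmul(R, cov), transpose(R))
    return new_mean, new_cov

# Example: a Gaussian bound entirely to a joint that rotates 90 degrees
# about z and shifts 0.5 along z.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
mean, cov = deform_gaussian(
    mean=[1.0, 0.0, 0.0],
    cov=[[4.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    weights=[0.0, 1.0],
    rotations=[identity, rot_z(math.pi / 2)],
    translations=[[0.0, 0.0, 0.0], [0.0, 0.0, 0.5]],
)
```

In this example the center moves to approximately (0, 1, 0.5), and the long axis of the Gaussian swings from x to y, which is the behaviour one wants when a limb rotates. A regularizer on the transformations, as the summary describes, would presumably penalize blended transforms that drift far from rigid motion; that part is not sketched here.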

GaussianBody: Clothed Human Reconstruction via 3d Gaussian Splatting

Gaussian Head & Shoulders: High Fidelity Neural Upper Body Avatars with Anchor Gaussian Guided Texture Warping

PBDyG: Position Based Dynamic Gaussians for Motion-Aware Clothed Human Avatars

Human Gaussian Splatting: Real-time Rendering of Animatable Avatars

HFGaussian: Learning Generalizable Gaussian Human with Integrated Human Features

SelfRecon: Self Reconstruction Your Digital Avatar from Monocular Video

3D Body Shapes Estimation from Dressed-Human Silhouettes.

GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers

SplatArmor: Articulated Gaussian splatting for animatable humans from monocular RGB videos

3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting

Innovative AI techniques for photorealistic 3D clothed human reconstruction from monocular images or videos: a survey

MultiGO: Towards Multi-level Geometry Learning for Monocular 3D Textured Human Reconstruction

MOSS: Motion-based 3D Clothed Human Synthesis from Monocular Video

Topology-aware Human Avatars with Semantically-guided Gaussian Splatting

Surfel-based Gaussian Inverse Rendering for Fast and Relightable Dynamic Human Reconstruction from Monocular Video

High-Resolution Volumetric Reconstruction for Clothed Humans

Animatable 3D Gaussian: Fast and High-Quality Reconstruction of Multiple Human Avatars

Gaussian Garments: Reconstructing Simulation-Ready Clothing with Photorealistic Appearance from Multi-View Video

Single-view 3D Body and Cloth Reconstruction under Complex Poses

GauHuman: Articulated Gaussian Splatting from Monocular Human Videos

DressRecon: Freeform 4D Human Reconstruction from Monocular Video