Abstract: This paper presents a novel self-supervised approach to reconstructing human shape and pose from noisy point cloud data. Recent learning-based approaches rely on large datasets with ground-truth annotations to predict correspondences for every vertex of the point cloud, and they typically use the Chamfer distance to minimize the distance between a deformed template model and the input point cloud. However, the Chamfer distance is sensitive to noise and outliers and can therefore be unreliable for assigning correspondences. To address these issues, we model the probability distribution of the input point cloud as generated from a parametric human model under a Gaussian Mixture Model. Instead of explicitly aligning correspondences, we treat correspondence search as an implicit probabilistic association, updating the posterior probability of the template model given the input. We further derive a novel self-supervised loss that penalizes the discrepancy between the deformed template and the input point cloud, conditioned on the posterior probability. Our approach is flexible: it works with both complete and incomplete point clouds, including a single depth image as input. Compared with previous self-supervised methods, our method can cope with substantial noise and outliers. Extensive experiments on several public synthetic datasets and on a very noisy real dataset (CMU Panoptic) demonstrate the superior performance of our approach over state-of-the-art methods.
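
To make the outlier sensitivity of the Chamfer distance concrete, here is a minimal NumPy sketch. It is not taken from the paper: the `chamfer_distance` helper, the random "template", the noise level, and the placement of the outliers are all illustrative assumptions.

```python
import numpy as np


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    # Pairwise squared Euclidean distances, shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    # Mean nearest-neighbour squared distance, taken in both directions.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()


rng = np.random.default_rng(0)

# Hypothetical stand-in for template vertices (not a real body model).
template = rng.normal(size=(200, 3))

# A "scan" that matches the template up to small sensor noise.
clean_scan = template + 0.01 * rng.normal(size=template.shape)

# The same scan with 10 far-away outlier points appended.
outliers = rng.uniform(5.0, 6.0, size=(10, 3))
noisy_scan = np.vstack([clean_scan, outliers])

cd_clean = chamfer_distance(template, clean_scan)
cd_noisy = chamfer_distance(template, noisy_scan)

print(f"Chamfer (clean scan):    {cd_clean:.6f}")
print(f"Chamfer (with outliers): {cd_noisy:.6f}")
```

In this toy setup the outliers make up under 5% of the scan, yet each one is forced to match some template point, so the scan-to-template term dominates the total even though the template already fits the clean points well.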

What problem does this paper attempt to address?

This paper addresses the problem of reconstructing human shape and pose from noisy point cloud data. Existing learning-based methods perform poorly on point clouds that contain noise and outliers, and they usually rely on large labeled datasets for training. In addition, existing methods usually require complete point clouds as input, which is uncommon in practice because acquired point clouds are often incomplete (for example, a scan obtained from a single depth image). To address these problems, the paper proposes a new self-supervised method that uses a probabilistic model to handle noise and outliers and that also works with incomplete point cloud data.

### Main problem summary:

1. **Sensitivity to noise and outliers**: Existing methods are very sensitive to noise and outliers, which degrades reconstruction quality.
2. **Dependence on complete point clouds**: Most existing methods require complete point clouds as input, a requirement that is hard to meet in practical applications.
3. **Dependence on large amounts of labeled data**: Existing methods usually rely on large amounts of labeled training data, which is very difficult to obtain in practice.

### Solutions:

- **Probabilistic modeling**: Use a Gaussian Mixture Model (GMM) to model the probability distribution of the input point cloud instead of directly regressing one-to-one correspondences. This handles noise and outliers more flexibly.
- **Self-supervised loss function**: Introduce a new self-supervised loss that penalizes the discrepancy between the deformed template and the input point cloud, weighted by the posterior probability.
- **Handling incomplete point clouds**: The method works with incomplete point cloud data, including the case where the input is a single depth image.

With these changes, the method reconstructs human shape and pose more robustly in the presence of noise, outliers, and incomplete point clouds.
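
The idea of soft, posterior-weighted correspondences can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's formulation: it treats template vertices as centroids of an isotropic, equally weighted GMM and adds a uniform outlier component in the style of Coherent Point Drift. The function names, `sigma2`, `outlier_weight`, and the normalization of the loss are all hypothetical choices made for the example.

```python
import numpy as np


def gmm_posterior(points, centroids, sigma2, outlier_weight):
    """Posterior that each input point was generated by each centroid.

    Assumes an isotropic GMM with equal mixing weights plus a uniform
    outlier component (a CPD-style choice, not taken from the paper).
    Returns the posterior matrix (N, M) and squared distances (N, M).
    """
    n, dim = points.shape
    m = centroids.shape[0]
    d2 = np.sum((points[:, None, :] - centroids[None, :, :]) ** 2, axis=-1)
    lik = np.exp(-d2 / (2.0 * sigma2))
    # Constant contributed by the uniform outlier component.
    c = ((2.0 * np.pi * sigma2) ** (dim / 2.0)
         * outlier_weight / (1.0 - outlier_weight) * m / n)
    post = lik / (lik.sum(axis=1, keepdims=True) + c)
    return post, d2


def posterior_weighted_loss(points, centroids, sigma2=0.01, outlier_weight=0.1):
    """Squared distance averaged under the posterior (illustrative loss)."""
    post, d2 = gmm_posterior(points, centroids, sigma2, outlier_weight)
    return np.sum(post * d2) / max(post.sum(), 1e-12)


rng = np.random.default_rng(0)
template = rng.normal(size=(200, 3))              # hypothetical template vertices
clean_scan = template + 0.01 * rng.normal(size=template.shape)
outliers = rng.uniform(5.0, 6.0, size=(10, 3))    # far-away spurious points
noisy_scan = np.vstack([clean_scan, outliers])

post, _ = gmm_posterior(noisy_scan, template, sigma2=0.01, outlier_weight=0.1)
inlier_mass = post[:200].sum(axis=1)   # posterior mass assigned to the template
outlier_mass = post[200:].sum(axis=1)  # the remainder goes to the outlier term

loss_clean = posterior_weighted_loss(clean_scan, template)
loss_noisy = posterior_weighted_loss(noisy_scan, template)
print(f"min inlier mass:  {inlier_mass.min():.4f}")
print(f"max outlier mass: {outlier_mass.max():.2e}")
print(f"loss clean / noisy: {loss_clean:.6f} / {loss_noisy:.6f}")
```

In this toy setup the far-away points receive essentially no posterior mass, so they barely influence the loss, which is the robustness property the summary attributes to the probabilistic formulation. In a training pipeline the posterior would plausibly be recomputed as the template deforms; how the paper handles that step is not specified here.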

Self-supervised 3D Human Mesh Recovery from Noisy Point Clouds
