Abstract: Consumer-level RGB-D cameras are widely used for dense 3D reconstruction of scenes. For textureless or non-Lambertian surfaces in particular, they can ensure completeness of the reconstructed models at low cost. However, the reconstruction quality relies heavily on the accuracy of the depth sensors. Digital cameras are also widely used to capture high-resolution images for high-quality dense reconstruction, but they cannot handle textureless or non-Lambertian regions well because of visual ambiguity. To ensure both completeness and accuracy of the reconstructed 3D models, we propose a hybrid multi-view reconstruction pipeline named Hybrid-MVS, which combines high-resolution images taken by a digital camera with low-resolution RGB-D frames captured by a consumer RGB-D camera for robust reconstruction of complicated scenes with challenging textureless and non-Lambertian surfaces. Unlike most existing multi-sensor systems, which require explicit hardware calibration and synchronization of the sensors, our pipeline solves the calibration and synchronization problems between the digital camera and the RGB-D camera implicitly in order to compose reliable depth priors for the digital images. In particular, we propose a hybrid MVS framework for robust PatchMatch stereo and Delaunay meshing, which tightly couples the visual cues from the digital images with the depth cues from the RGB-D frames to exploit their complementary advantages. Quantitative and qualitative experiments demonstrate the effectiveness of the proposed Hybrid-MVS framework, which achieves high-quality 3D reconstruction of complicated natural scenes and is robust to weakly textured and non-Lambertian areas.
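The coupling of visual and depth cues described in the abstract can be illustrated with a minimal per-pixel sketch. This is not the paper's formulation: the truncated linear penalty, the function names `hybrid_cost` and `select_depth`, and the parameters `prior_weight` and `truncation` are all assumptions chosen for illustration. It only shows how a depth prior might break a tie between depth hypotheses whose photometric costs are nearly equal, as happens on textureless surfaces.

```python
# Illustrative sketch only: the cost form and all names are assumptions,
# not taken from the Hybrid-MVS paper.

def hybrid_cost(photo_cost, depth, prior_depth, prior_weight=0.5, truncation=0.05):
    """Photometric cost plus a truncated penalty for deviating from a depth prior.

    prior_depth is None when no RGB-D measurement is available for the pixel,
    in which case only the photometric cost is used.
    """
    if prior_depth is None:
        return photo_cost
    # Truncate the deviation so a wrong prior cannot dominate the cost.
    deviation = min(abs(depth - prior_depth), truncation)
    return photo_cost + prior_weight * deviation / truncation


def select_depth(candidates, prior_depth=None):
    """Pick the depth with the lowest hybrid cost.

    candidates is a list of (depth, photo_cost) hypotheses for one pixel.
    """
    best = min(candidates, key=lambda c: hybrid_cost(c[1], c[0], prior_depth))
    return best[0]


# Two hypotheses with nearly identical photometric cost (ambiguous region).
hypotheses = [(2.0, 0.30), (1.0, 0.31)]
without_prior = select_depth(hypotheses)
with_prior = select_depth(hypotheses, prior_depth=1.02)
```

In this toy setup the photometric cost alone favors the hypothesis at depth 2.0 by a negligible margin, while a prior near 1.0 shifts the choice to the hypothesis that agrees with the depth measurement.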

SS-MVMETRO: Semi-supervised multi-view human mesh recovery transformer

Human Mesh Recovery from Arbitrary Multi-view Images

Multi-view Human Body Mesh Translator

Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers

End-to-End Human Pose and Mesh Reconstruction with Transformers

Hybrid-MVS: Robust Multi-View Reconstruction with Hybrid Optimization of Visual and Depth Cues

Human Mesh Recovery from Monocular Images via a Skeleton-disentangled Representation

Distribution and Depth-Aware Transformers for 3D Human Mesh Recovery

PostoMETRO: Pose Token Enhanced Mesh Transformer for Robust 3D Human Mesh Recovery

Geometry-Biased Transformer for Robust Multi-View 3D Human Pose Reconstruction

MH‐HMR: Human mesh recovery from monocular images via multi‐hypothesis learning

Multi-View Human Mesh Reconstruction via Direction-Aware Feature Fusion

Multiview Textured Mesh Recovery by Differentiable Rendering

Synthetic Training for Monocular Human Mesh Recovery

3D Human Mesh Recovery with Sequentially Global Rotation Estimation

Mixed Transformer for Temporal 3D Human Pose and Shape Estimation from Monocular Video

Recovering 3D Human Mesh from Monocular Images: A Survey

Human Mesh Reconstruction with Generative Adversarial Networks from Single RGB Images

Direct Multi-view Multi-person 3D Pose Estimation

Delving Deep into Pixel Alignment Feature for Accurate Multi-view Human Mesh Recovery

Towards Robust RGB-D Human Mesh Recovery