Abstract: Reconstructing high-fidelity hand models with intricate textures plays a crucial role in enhancing human-object interaction and advancing real-world applications. Although state-of-the-art methods excel in texture generation and image rendering, they often struggle to capture geometric details accurately. Learning-based approaches usually offer better robustness and faster inference, but they tend to produce over-smoothed results and require substantial amounts of training data. To address these issues, we present a novel fine-grained multi-view hand mesh reconstruction method that leverages inverse rendering to restore hand poses and intricate details. First, our approach predicts a parametric hand mesh model from multi-view images using a Graph Convolutional Network (GCN)-based method. We then introduce a novel Hand Albedo and Mesh (HAM) optimization module that refines both the hand mesh and its textures while preserving the mesh topology. In addition, we propose an effective mesh-based neural rendering scheme that simultaneously generates photo-realistic images and optimizes mesh geometry by fusing a pre-trained rendering network with vertex features. We conduct comprehensive experiments on InterHand2.6M, DeepHandMesh, and a dataset we collected ourselves. The results show that our approach outperforms state-of-the-art methods in both reconstruction accuracy and rendering quality. Code and dataset are publicly available at https://github.com/agnJason/FMHR.

CAMInterHand: Cooperative Attention for Multi-View Interactive Hand Pose and Mesh Reconstruction

In-Hand 3D Object Reconstruction from a Monocular RGB Video

Personalized Hand Modeling from Multiple Postures with Multi‐View Color Images

Decoupled Iterative Refinement Framework for Interacting Hands Reconstruction from a Single RGB Image

Interacting Attention Graph for Single Image Two-Hand Reconstruction

MVHANet: Multi-view Hierarchical Aggregation Network for Skeleton-Based Hand Gesture Recognition

SiMA-Hand: Boosting 3D Hand-Mesh Reconstruction by Single-to-Multi-View Adaptation

Fine-Grained Multi-View Hand Reconstruction Using Inverse Rendering

Multi-view Hand Reconstruction with a Point-Embedded Transformer

MLPHand: Real Time Multi-View 3D Hand Mesh Reconstruction via MLP Modeling

Exploiting Spatial-Temporal Context for Interacting Hand Reconstruction on Monocular RGB Video

MLPHand: Real Time Multi-View 3D Hand Reconstruction Via MLP Modeling

Single Depth View Based Real-Time Reconstruction of Hand-Object Interactions

OmniHands: Towards Robust 4D Hand Mesh Recovery via A Versatile Transformer

3D Hand Reconstruction via Aggregating Intra and Inter Graphs Guided by Prior Knowledge for Hand-Object Interaction Scenario

Real-time pose and shape reconstruction of two interacting hands with a single depth camera

Reconstructing Interacting Hands with Interaction Prior from Monocular Images

HandGCAT: Occlusion-Robust 3D Hand Mesh Reconstruction from Monocular Images

MOHO: Learning Single-view Hand-held Object Reconstruction with Multi-view Occlusion-Aware Supervision

Two Hands Are Better Than One: Resolving Hand to Hand Intersections via Occupancy Networks