In-Hand 3D Object Reconstruction from a Monocular RGB Video

Shijian Jiang, Qi Ye, Rengan Xie, Yuchi Huo, Xiang Li, Yang Zhou, Jiming Chen
DOI: https://doi.org/10.1609/aaai.v38i3.28029
2024-01-01
Proceedings of the AAAI Conference on Artificial Intelligence
Abstract: Our work aims to reconstruct a 3D object that is held and rotated by a hand in front of a static RGB camera. Previous methods that use implicit neural representations to recover the geometry of a generic hand-held object from multi-view images achieved compelling results in the visible part of the object. However, these methods falter in accurately capturing the shape within the hand-object contact region due to occlusion. In this paper, we propose a novel method that deals with surface reconstruction under occlusion by incorporating priors of 2D occlusion elucidation and physical contact constraints. For the former, we introduce an object amodal completion network to infer the 2D complete mask of objects under occlusion. To ensure the accuracy and view consistency of the predicted 2D amodal masks, we devise a joint optimization method for both amodal mask refinement and 3D reconstruction. For the latter, we impose penetration and attraction constraints on the local geometry in contact regions. We evaluate our approach on HO3D and HOD datasets and demonstrate that it outperforms the state-of-the-art methods in terms of reconstruction surface quality, with an improvement of 52% on HO3D and 20% on HOD. Project webpage: https://east-j.github.io/ihor.