3DFIRES: Few Image 3D REconstruction for Scenes with Hidden Surface

Linyi Jin, Nilesh Kulkarni, David Fouhey
2024-03-14
Abstract: This paper introduces 3DFIRES, a novel system for scene-level 3D reconstruction from posed images. Designed to work with as few as one view, 3DFIRES reconstructs the complete geometry of unseen scenes, including hidden surfaces. With multiple view inputs, our method produces full reconstruction within all camera frustums. A key feature of our approach is the fusion of multi-view information at the feature level, enabling the production of coherent and comprehensive 3D reconstruction. We train our system on non-watertight scans from a large-scale real scene dataset. We show it matches the efficacy of single-view reconstruction methods with only one input and surpasses existing techniques in both quantitative and qualitative measures for sparse-view 3D reconstruction.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper proposes 3DFIRES, a system for scene-level 3D reconstruction from a small number of posed images, including the reconstruction of hidden surfaces. Existing computer vision methods struggle to infer 3D structure accurately from one or a few images, especially where surfaces are occluded. 3DFIRES tackles this problem by fusing information from the different input images at the feature level, which allows it to generate a coherent and comprehensive 3D reconstruction. The system is trained on non-watertight scans from a large-scale real scene dataset, enabling it to predict all surfaces in the scene, both visible and occluded, even from a single input image. As the number of input images increases, 3DFIRES reconstructs the full geometry within all of the camera frustums. The key innovation is this feature-level fusion, which lets the system learn how best to use the available image data. With a single input view, 3DFIRES matches the efficacy of single-view reconstruction methods; in the sparse-view setting, it surpasses existing techniques in both quantitative and qualitative evaluations, particularly in the reconstruction of hidden regions. It also adapts to a varying number of input views and can handle camera poses estimated with LoFTR. In this way, 3DFIRES reduces the geometric distortion that existing techniques show when given sparse-view inputs.
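The idea of feature-level fusion can be illustrated with a minimal sketch. It is not the authors' implementation: the `View` container, the pinhole camera parameters, nearest-pixel sampling, and plain averaging are all assumptions chosen for brevity, and the summary above does not say which fusion operator 3DFIRES uses. The sketch projects a 3D query point into each posed view, keeps only the views whose frustum contains the point, and averages their image features.

```python
from typing import List, Optional, Sequence


class View:
    """A posed input view (hypothetical container, not the paper's data structure)."""

    def __init__(self, feat_map, focal, cx, cy, rotation, translation):
        self.feat_map = feat_map        # H x W grid of feature vectors (lists of floats)
        self.focal = focal              # focal length in pixels (pinhole model)
        self.cx, self.cy = cx, cy       # principal point in pixels
        self.rotation = rotation        # 3x3 world-to-camera rotation, row-major
        self.translation = translation  # world-to-camera translation


def project(view: View, point: Sequence[float]):
    """Return (u, v, depth) of a world point, or None if it lies behind the camera."""
    cam = [
        sum(view.rotation[r][k] * point[k] for k in range(3)) + view.translation[r]
        for r in range(3)
    ]
    depth = cam[2]
    if depth <= 1e-6:
        return None
    u = view.focal * cam[0] / depth + view.cx
    v = view.focal * cam[1] / depth + view.cy
    return u, v, depth


def sample_if_visible(view: View, point: Sequence[float]) -> Optional[List[float]]:
    """Nearest-pixel feature if the point projects inside the image, else None.

    No far plane is applied, so the frustum here is unbounded in depth.
    """
    proj = project(view, point)
    if proj is None:
        return None
    u, v, _ = proj
    col, row = int(round(u)), int(round(v))
    height, width = len(view.feat_map), len(view.feat_map[0])
    if 0 <= row < height and 0 <= col < width:
        return view.feat_map[row][col]
    return None


def fuse_point_features(views: Sequence[View], point: Sequence[float]) -> Optional[List[float]]:
    """Average the features of every view whose frustum contains the point.

    Averaging is a stand-in for whatever learned fusion the real system uses.
    Returns None when no view sees the point.
    """
    samples = [f for f in (sample_if_visible(v, point) for v in views) if f is not None]
    if not samples:
        return None
    dim = len(samples[0])
    return [sum(s[i] for s in samples) / len(samples) for i in range(dim)]
```

Because the function accepts any number of views and skips those that do not see the query point, the same code path covers the single-view case and the union of several frustums. In a real system, a network would decode the fused feature into a geometry prediction for that point.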