LEAP: Liberate Sparse-view 3D Modeling from Camera Poses

Hanwen Jiang, Zhenyu Jiang, Yue Zhao, Qixing Huang
2023-10-03
Abstract: Are camera poses necessary for multi-view 3D modeling? Existing approaches predominantly assume access to accurate camera poses. While this assumption might hold for dense views, accurately estimating camera poses for sparse views is often elusive. Our analysis reveals that noisy estimated poses lead to degraded performance for existing sparse-view 3D modeling methods. To address this issue, we present LEAP, a novel pose-free approach, thereby challenging the prevailing notion that camera poses are indispensable. LEAP discards pose-based operations and learns geometric knowledge from data. LEAP is equipped with a neural volume, which is shared across scenes and is parameterized to encode geometry and texture priors. For each incoming scene, we update the neural volume by aggregating 2D image features in a feature-similarity-driven manner. The updated neural volume is decoded into the radiance field, enabling novel view synthesis from any viewpoint. On both object-centric and scene-level datasets, we show that LEAP significantly outperforms prior methods when they employ predicted poses from state-of-the-art pose estimators. Notably, LEAP performs on par with prior approaches that use ground-truth poses while running $400\times$ faster than PixelNeRF. We show that LEAP generalizes to novel object categories and scenes, and that the knowledge it learns closely resembles epipolar geometry. Project page: https://hwjiang1510.github.io/LEAP/
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper asks whether multi-view 3D modeling must rely on accurate camera poses. Most existing methods assume that accurate poses are available, but this assumption often fails for sparse views, where estimating camera poses accurately is very challenging. The paper shows that inaccurate camera poses degrade the performance of existing sparse-view 3D modeling methods. It therefore proposes LEAP (Liberate Sparse-View 3D Modeling From Camera Poses), a pose-free method that removes the dependence on camera poses and challenges the traditional view that they are indispensable for 3D modeling. The main contributions of LEAP are as follows:

1. **Pose independence**: LEAP abandons operations that explicitly use camera poses, such as projection, and instead learns pose-related geometric knowledge from data.
2. **Neural volume**: LEAP introduces a neural volume that is shared across scenes and parameterized to encode geometry and texture priors. For each input scene, the neural volume is updated by aggregating 2D image features according to feature similarity.
3. **Fast inference**: LEAP predicts the radiance field in a single forward pass, with no per-scene optimization, so it runs in less than one second on a single consumer-level GPU.
4. **Strong generalization ability**: LEAP accurately models objects from novel categories, and a model trained on large object-centric datasets transfers well to the scene-level DTU dataset.

Overall, LEAP addresses a key problem in sparse-view 3D modeling: how to perform high-quality 3D modeling without accurate camera pose information. According to the abstract, it significantly outperforms prior methods when they use predicted poses, performs on par with prior approaches that use ground-truth poses, and runs $400\times$ faster than PixelNeRF, while also generalizing to novel object categories and scenes.
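The feature-similarity-driven update of the neural volume can be illustrated with a small sketch. The summary above does not specify the aggregation operator, so this example assumes scaled dot-product cross-attention, with voxel tokens as queries and flattened 2D image features as keys and values. The function name `update_neural_volume`, the projection matrices `w_q`, `w_k`, `w_v`, and all tensor sizes are hypothetical and chosen only for illustration; this is not the authors' implementation. Note that no camera pose or projection appears anywhere: each voxel gathers image features purely by similarity.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def update_neural_volume(volume, image_feats, w_q, w_k, w_v):
    """Assumed cross-attention update (illustrative, not the paper's code).

    volume:      (V, D) voxel tokens of the shared neural volume
    image_feats: (N, D) 2D image features flattened over all input views
    returns:     updated (V, D) volume and the (V, N) attention weights
    """
    q = volume @ w_q        # queries come from the voxels
    k = image_feats @ w_k   # keys come from the image features
    v = image_feats @ w_v   # values come from the image features
    # Similarity between every voxel and every image feature, no poses involved.
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    # Residual update: each voxel adds a similarity-weighted mix of image features.
    return volume + attn @ v, attn


rng = np.random.default_rng(0)
V, N, D = 8 ** 3, 5 * 16 * 16, 32  # hypothetical: 8^3 voxels, 5 views of 16x16 features
volume = rng.normal(size=(V, D))
image_feats = rng.normal(size=(N, D))
w_q, w_k, w_v = (rng.normal(size=(D, D)) / np.sqrt(D) for _ in range(3))

updated, attn = update_neural_volume(volume, image_feats, w_q, w_k, w_v)
print(updated.shape, attn.shape)
```

In this sketch the random matrices stand in for learned parameters, and each row of `attn` sums to one, so every voxel receives a convex combination of the projected image features.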