GeoNeRF: Generalizing NeRF with Geometry Priors

Mohammad Mahdi Johari, Yann Lepoittevin, François Fleuret
DOI: https://doi.org/10.48550/arXiv.2111.13539
2022-03-21
Abstract: We present GeoNeRF, a generalizable photorealistic novel view synthesis method based on neural radiance fields. Our approach consists of two main stages: a geometry reasoner and a renderer. To render a novel view, the geometry reasoner first constructs cascaded cost volumes for each nearby source view. Then, using a Transformer-based attention mechanism and the cascaded cost volumes, the renderer infers geometry and appearance, and renders detailed images via classical volume rendering techniques. This architecture, in particular, allows sophisticated occlusion reasoning, gathering information from consistent source views. Moreover, our method can easily be fine-tuned on a single scene, and renders competitive results with per-scene optimized neural rendering methods with a fraction of computational cost. Experiments show that GeoNeRF outperforms state-of-the-art generalizable neural rendering models on various synthetic and real datasets. Lastly, with a slight modification to the geometry reasoner, we also propose an alternative model that adapts to RGBD images. This model directly exploits the depth information often available thanks to depth sensors. The implementation code is available at https://www.idiap.ch/paper/geonerf.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the limitations of existing Neural Radiance Field (NeRF) methods when they are applied to new scenes. Specifically:

1. **Limitations of NeRF**:
   - NeRF must be trained from scratch for each scene, which is time-consuming and requires a large number of densely captured images.
   - The per-scene optimization is long and computationally expensive.
2. **Deficiencies of existing improvement methods**:
   - Methods such as pixelNeRF, GRF, MINE, SRF, IBRNet, MVSNeRF, and NeRFormer attempt to generalize NeRF by extracting features from source images, but they have a limited understanding of scene geometry and occlusion, which leads to unwanted artifacts in the rendered output.
   - MVSNeRF generalizes NeRF by constructing a low-resolution 3D cost volume, but it has difficulty rendering detailed images and handling occlusions in the scene.
3. **Goals of GeoNeRF**:
   - **Improve generalization ability**: GeoNeRF aims to generalize to unseen scenes by combining a geometry reasoner with a renderer, reducing the need for per-scene optimization.
   - **Improve rendering quality**: By using cascaded cost volumes and an attention mechanism, GeoNeRF handles occlusions and fine details better and therefore generates higher-quality images.
   - **Reduce computational costs**: After fine-tuning on a single scene, GeoNeRF achieves results competitive with per-scene-optimized methods at a fraction of the computational cost.

In conclusion, GeoNeRF aims to overcome NeRF's limitations in generalization and rendering quality by improving geometric reasoning and the rendering mechanism, making it more efficient and accurate on new scenes.
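The abstract states that the renderer produces images "via classical volume rendering techniques". As a rough illustration of that final step only, the following is a minimal NumPy sketch of the standard NeRF-style quadrature along one ray. It is not taken from the GeoNeRF code: the function name `composite_ray`, its arguments, and the choice to treat the last sample interval as effectively infinite are assumptions made for this example. In GeoNeRF the densities and colors would come from the renderer described above; here they are plain arrays.

```python
import numpy as np

def composite_ray(sigmas, colors, t_vals):
    """Composite per-sample densities and colors along a single ray.

    Hypothetical helper for illustration (not the GeoNeRF API).
    sigmas: (N,) non-negative volume densities at the samples
    colors: (N, 3) RGB values at the samples
    t_vals: (N,) increasing sample distances along the ray
    Returns (rgb, depth, weights).
    """
    sigmas = np.asarray(sigmas, dtype=float)
    colors = np.asarray(colors, dtype=float)
    t_vals = np.asarray(t_vals, dtype=float)

    # Interval lengths between adjacent samples. The last interval is
    # treated as effectively infinite (an assumed, common convention).
    deltas = np.append(np.diff(t_vals), 1e10)

    # Opacity of each interval: alpha_i = 1 - exp(-sigma_i * delta_i).
    alphas = 1.0 - np.exp(-sigmas * deltas)

    # Transmittance: probability the ray reaches sample i unoccluded,
    # T_i = prod_{j < i} (1 - alpha_j), with T_0 = 1.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))

    # Contribution of each sample to the final pixel.
    weights = trans * alphas

    rgb = (weights[:, None] * colors).sum(axis=0)
    depth = (weights * t_vals).sum()
    return rgb, depth, weights

# Example: three samples, with an opaque red surface at the second one.
rgb, depth, weights = composite_ray(
    sigmas=[0.0, 1e6, 5.0],
    colors=[[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    t_vals=[1.0, 2.0, 3.0],
)
```

The transmittance term is what makes occlusion matter at render time: once an opaque sample is encountered, every sample behind it receives a weight close to zero, so accurate density estimates translate directly into correct occlusion in the final image.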