TriNeRFLet: A Wavelet Based Triplane NeRF Representation

Rajaei Khatib, Raja Giryes
2024-07-18
Abstract: In recent years, the neural radiance field (NeRF) model has gained popularity due to its ability to recover complex 3D scenes. Following its success, many approaches proposed different NeRF representations in order to further improve both runtime and performance. One such example is Triplane, in which NeRF is represented using three 2D feature planes. This enables easily using existing 2D neural networks in this framework, e.g., to generate the three planes. Despite its advantage, the triplane representation lagged behind in its 3D recovery quality compared to NeRF solutions. In this work, we propose TriNeRFLet, a 2D wavelet-based multiscale triplane representation for NeRF, which closes the 3D recovery performance gap and is competitive with current state-of-the-art methods. Building upon the triplane framework, we also propose a novel super-resolution (SR) technique that combines a diffusion model with TriNeRFLet for improving NeRF resolution.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses a weakness of the Triplane representation for Neural Radiance Fields (NeRF): its 3D reconstruction quality. Although Triplane builds on 2D feature planes and can therefore reuse existing 2D neural networks, it lags behind other efficient multi-view reconstruction methods in 3D scene recovery quality. To close this gap, the authors propose TriNeRFLet, a Triplane representation based on wavelet transforms. The main contributions of this framework are:

1. **Improved Triplane Representation**: The method learns wavelet representations of the feature planes and regularizes them, which shares information across scales and regularizes the high-frequency components. Regions covered by the training views are used effectively, while regions not covered by the training views are still updated through lower-resolution estimates, improving overall reconstruction quality.
2. **Multi-Scale Learning**: The multi-scale nature of wavelets is used for multi-scale training, which starts at low resolution and gradually increases the resolution. This helps reduce training time and improve reconstruction results.
3. **NeRF Super-Resolution**: A new NeRF super-resolution method is proposed that combines TriNeRFLet with a pre-trained 2D diffusion model. It enhances the resolution of a NeRF without specialized training on multi-view or 3D data.
4. **Experimental Validation**: Experiments on the Blender and LLFF datasets evaluate TriNeRFLet on 3D scene reconstruction quality and on NeRF super-resolution.

In summary, TriNeRFLet improves the Triplane representation by introducing wavelet transforms, raising the quality of 3D scene reconstruction and providing an effective approach to NeRF super-resolution.
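The wavelet-based plane representation and the coarse-to-fine schedule described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the Haar filters, the `(lh, hl, hh)` band layout, the L1 form of the regularizer, and all function names (`haar_idwt2`, `reconstruct_plane`, `wavelet_l1_penalty`) are assumptions made for illustration only. The idea shown is that the learnable parameters are wavelet coefficients, and that each feature plane is obtained by applying inverse wavelet steps to them, so stopping early yields a lower-resolution plane.

```python
import numpy as np


def haar_idwt2(ll, lh, hl, hh):
    """One level of the inverse 2D orthonormal Haar transform.

    Each input has shape (C, H, W); the output has shape (C, 2H, 2W).
    Haar is an assumption here; the paper may use a different wavelet.
    """
    c, h, w = ll.shape
    out = np.empty((c, 2 * h, 2 * w), dtype=ll.dtype)
    out[:, 0::2, 0::2] = (ll + lh + hl + hh) / 2
    out[:, 0::2, 1::2] = (ll - lh + hl - hh) / 2
    out[:, 1::2, 0::2] = (ll + lh - hl - hh) / 2
    out[:, 1::2, 1::2] = (ll - lh - hl + hh) / 2
    return out


def reconstruct_plane(base, details, levels=None):
    """Rebuild a feature plane from a coarse base and per-level detail bands.

    details[i] is a (lh, hl, hh) tuple at resolution base_res * 2**i.
    Passing a smaller `levels` yields a lower-resolution plane, which is
    what a coarse-to-fine training schedule could use in its early stages.
    """
    if levels is None:
        levels = len(details)
    plane = base
    for lh, hl, hh in details[:levels]:
        plane = haar_idwt2(plane, lh, hl, hh)
    return plane


def wavelet_l1_penalty(details):
    """L1 penalty on the detail bands (an assumed form of the regularizer)."""
    return float(sum(np.abs(band).sum() for level in details for band in level))


# Three planes (e.g. xy, xz, yz), each with 4 channels, an 8x8 base,
# and 2 detail levels. All sizes are hypothetical.
rng = np.random.default_rng(0)
triplane_coeffs = []
for _ in range(3):
    base = rng.normal(size=(4, 8, 8))
    details = [
        tuple(0.01 * rng.normal(size=(4, 8 * 2**i, 8 * 2**i)) for _ in range(3))
        for i in range(2)
    ]
    triplane_coeffs.append((base, details))

# Early training stage: one detail level only. Later stage: all levels.
coarse_planes = [reconstruct_plane(b, d, levels=1) for b, d in triplane_coeffs]
full_planes = [reconstruct_plane(b, d) for b, d in triplane_coeffs]
reg = sum(wavelet_l1_penalty(d) for _, d in triplane_coeffs)
```

In this sketch a gradient that reaches one location of a full-resolution plane also flows into the coarse `base` coefficients, which cover a larger area. That is one way to picture how regions not seen by the training views can still receive lower-resolution estimates. In an actual training loop the coefficients would be trainable tensors in an autodiff framework, and `reg` would be added to the rendering loss with some weight.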