Is Vanilla MLP in Neural Radiance Field Enough for Few-shot View Synthesis?

Hanxin Zhu, Tianyu He, Xin Li, Bingchen Li, Zhibo Chen
DOI: https://doi.org/10.1109/cvpr52733.2024.01918
2024-01-01
Computer Vision and Pattern Recognition
Abstract: Neural Radiance Field (NeRF) has achieved superior performance for novel view synthesis by modeling the scene with a Multi-Layer Perceptron (MLP) and a volume rendering procedure. However, when fewer known views are given (i.e., few-shot view synthesis), the model is prone to overfitting the given views. To handle this issue, previous efforts have leveraged learned priors or introduced additional regularizations. In contrast, in this paper, we provide, for the first time, an orthogonal method from the perspective of network structure. Given the observation that trivially reducing the number of model parameters alleviates the overfitting issue, but at the cost of missing details, we propose the multi-input MLP (mi-MLP), which incorporates the inputs (i.e., location and viewing direction) of the vanilla MLP into each layer to prevent overfitting without harming detailed synthesis. To further reduce artifacts, we propose to model colors and volume density separately and present two regularization terms. Extensive experiments on multiple datasets demonstrate that: 1) although the proposed mi-MLP is easy to implement, it is surprisingly effective, boosting the PSNR of the baseline from 14.73 to 24.23; 2) the overall framework achieves state-of-the-art results on a wide range of benchmarks. We will release the code upon publication.
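The multi-input idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the inputs are injected into each layer, so concatenation is assumed here. The function names (init_mi_mlp, mi_mlp_forward), the layer sizes, the ReLU activation, and the single 4-channel output are all hypothetical choices for illustration; positional encoding, the separate modeling of color and density, and the two regularization terms are omitted.

```python
import numpy as np

def init_mi_mlp(in_dim, hidden_dim, out_dim, num_layers, seed=0):
    """Create weights for an MLP in which every layer also sees the raw input.

    Assumption: the input is re-injected by concatenation, so every layer
    after the first has fan-in hidden_dim + in_dim.
    """
    rng = np.random.default_rng(seed)
    params = []
    prev = 0  # no hidden features exist before the first layer
    for _ in range(num_layers):
        fan_in = prev + in_dim
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, hidden_dim))
        params.append((W, np.zeros(hidden_dim)))
        prev = hidden_dim
    # Output layer: also receives the raw input alongside the last hidden features.
    fan_in = hidden_dim + in_dim
    W_out = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, out_dim))
    params.append((W_out, np.zeros(out_dim)))
    return params

def mi_mlp_forward(params, x):
    """Forward pass: concatenate the input x onto the features at every layer."""
    h = np.zeros((x.shape[0], 0))  # empty feature block for the first layer
    for W, b in params[:-1]:
        h = np.maximum(np.concatenate([h, x], axis=-1) @ W + b, 0.0)  # ReLU
    W_out, b_out = params[-1]
    return np.concatenate([h, x], axis=-1) @ W_out + b_out

# Hypothetical usage: 3-D location + 3-D viewing direction in, 4 channels out.
params = init_mi_mlp(in_dim=6, hidden_dim=32, out_dim=4, num_layers=3)
x = np.random.default_rng(1).normal(size=(5, 6))
out = mi_mlp_forward(params, x)
```

In a vanilla MLP the input enters only at the first layer. Here each later layer has direct access to it as well, which is the structural change the abstract attributes to mi-MLP.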