Hyunseo Kim, Hyeonseo Yang, Taekyung Kim, YoonSung Kim, Jin-Hwa Kim, Byoung-Tak Zhang
Abstract: Active view selection in 3D scene reconstruction has been widely studied since training on informative views is critical for reconstruction. Recently, Neural Radiance Fields (NeRF) variants have shown promising results in active 3D reconstruction using uncertainty-guided view selection. They utilize uncertainties estimated with neural networks that encode scene geometry and appearance. However, the choice of uncertainty integration methods, either voxel-based or neural rendering, has conventionally depended on the types of scene uncertainty being estimated, whether geometric or appearance-related. In this paper, we introduce Colorized Surface Voxel (CSV)-based view selection, a new next-best view (NBV) selection method exploiting surface voxel-based measurement of uncertainty in scene appearance. CSV encapsulates the uncertainty of estimated scene appearance (e.g., color uncertainty) and estimated geometric information (e.g., surface). Using the geometry information, we interpret the uncertainty of scene appearance 3D-wise during the aggregation of the per-voxel uncertainty. Consequently, the uncertainty from occluded and complex regions is recognized under challenging scenarios with limited input data. Our method outperforms previous works on popular datasets, DTU and Blender, and our new dataset with imbalanced viewpoints, showing that the CSV-based view selection significantly improves performance by up to 30%.
What problem does this paper attempt to address?
This paper addresses how to select the most informative next view in active 3D scene reconstruction. It introduces a view selection method based on the Colorized Surface Voxel (CSV), which chooses the next-best view (NBV) by measuring the uncertainty of scene appearance (for example, color uncertainty) on surface voxels. The main content and contributions of the paper are summarized below.
### Research Background and Problem
In 3D scene reconstruction, active view selection matters because training on informative views is critical to reconstruction quality. Recent NeRF variants select views using uncertainty estimated by the neural networks that encode scene geometry and appearance. However, the way this uncertainty is integrated, either voxel-based or through neural rendering, has conventionally been dictated by the type of uncertainty being estimated, geometric or appearance-related.
### Main Contributions
1. **Proposing CSV**: The authors introduce the Colorized Surface Voxel (CSV), a voxel grid that stores both color uncertainty and surface information. CSV serves as the basic component for interpreting color uncertainty in 3D, so that the aggregated uncertainty better reflects the scene.
2. **Modifying the Neural Implicit Surface Network**: The neural implicit surface network is modified to estimate color uncertainty and surface information simultaneously, which allows geometry and color information to be associated in the CSV.
3. **CSV-based View Selection**: The proposed view selection method uses surface information to aggregate the 3D-interpreted color uncertainty appropriately. The resulting per-image uncertainty reflects the reducible color uncertainty visible from that view.
4. **New Dataset**: A new dataset with imbalanced viewpoints is provided to reflect real-world data collection conditions.
### Method Overview
- **Estimation of Color Uncertainty and Surface Information**: The color of each 3D point is modeled as a Gaussian distribution, and the neural implicit surface network predicts the signed distance function (SDF) and the color uncertainty at each point.
- **Construction of CSV**: The color uncertainty and surface information are mapped onto the CSV grid, where each voxel carries an attribute (surfaceness) indicating whether it belongs to the surface. A voxel that belongs to the surface is called a surface voxel.
- **View Selection Strategy**: Uncertainty is aggregated differently depending on whether a voxel belongs to the surface. Using surface voxels rather than individual points improves robustness in the early training stage; when a ray meets no clear surface voxel, the uncertainty of all voxels traversed by the ray is aggregated instead.
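
The CSV construction step above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `build_csv`, the near-surface threshold `sdf_eps`, the cubic scene bounds, and the choice to average the color standard deviation per voxel are all hypothetical choices made for the example.

```python
import numpy as np

def build_csv(points, sdf, beta, grid_res=8, bounds=(-1.0, 1.0), sdf_eps=0.05):
    """Bin sampled 3D points into a voxel grid (illustrative sketch).

    points: (N, 3) sample positions; sdf: (N,) predicted signed distances;
    beta: (N,) predicted color standard deviations.
    Returns (surfaceness, uncertainty), both of shape (grid_res,) * 3.
    """
    lo, hi = bounds
    # Map every point to an integer voxel index, clipped to the grid.
    idx = np.floor((points - lo) / (hi - lo) * grid_res).astype(int)
    idx = np.clip(idx, 0, grid_res - 1)
    shape = (grid_res,) * 3
    flat = np.ravel_multi_index(tuple(idx.T), shape)

    n_vox = grid_res ** 3
    counts = np.bincount(flat, minlength=n_vox)
    beta_sum = np.bincount(flat, weights=beta, minlength=n_vox)
    # Assumption: per-voxel uncertainty is the mean beta of the points inside;
    # empty voxels keep an uncertainty of 0.
    uncertainty = np.divide(beta_sum, counts,
                            out=np.zeros(n_vox), where=counts > 0)

    # Assumption: a voxel is a surface voxel if any point inside it has
    # |SDF| below sdf_eps.
    near_surface = (np.abs(sdf) < sdf_eps).astype(float)
    surfaceness = np.bincount(flat, weights=near_surface, minlength=n_vox) > 0

    return surfaceness.reshape(shape), uncertainty.reshape(shape)
```

Binning with `np.bincount` keeps the sketch free of Python loops over points; a real system would more likely query the network on a regular grid or along rays.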
### Experimental Results
The authors ran experiments on the DTU and Blender datasets and on their new dataset with imbalanced viewpoints. The CSV-based view selection outperforms previous methods in image rendering and mesh reconstruction, and the abstract reports performance improvements of up to 30%. The method is particularly effective for occluded objects and imbalanced views.
### Formula Summary
- **Color Model under Gaussian Distribution**:
\[
c(r(t))\sim \mathcal{N}(\bar{c}(r(t)),\beta^{2}(r(t)))
\]
- **Definition of Color Entropy**:
\[
H(c(r(t))) = -\mathbb{E}[\log \mathcal{N}(\bar{c}(r(t)),\beta^{2}(r(t)))]=\frac{1}{2}\log(2\pi\beta^{2}(r(t)))+\frac{1}{2}
\]
- **Information Gain Formula**:
\[
G_{s}(v)=\frac{1}{n}\sum_{\forall r\in R_{v}}\sum_{\forall x\in X_{r}\cap S}H(c(x))
\]
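
The entropy and information gain formulas above can be evaluated numerically as in the following sketch. It is a hedged illustration rather than the paper's code: the function names, the flat voxel indexing, and the reading of \(n\) as the number of rays in \(R_v\) are assumptions, since the summary does not define \(n\).

```python
import numpy as np

def color_entropy(beta):
    """Differential entropy of a 1D Gaussian with standard deviation beta."""
    beta = np.asarray(beta, dtype=float)
    return 0.5 * np.log(2.0 * np.pi * beta ** 2) + 0.5

def information_gain(ray_voxels, surface_mask, voxel_beta):
    """G_s(v): entropy summed over the surface voxels hit by each ray.

    ray_voxels: list of integer index arrays, one per ray (the set X_r).
    surface_mask: boolean array over flat voxel indices (the set S).
    voxel_beta: per-voxel color standard deviation.
    """
    n = len(ray_voxels)  # assumption: n is the number of rays in R_v
    total = 0.0
    for x_r in ray_voxels:
        x_r = np.asarray(x_r, dtype=int)
        hit = x_r[surface_mask[x_r]]  # X_r intersected with S
        total += color_entropy(voxel_beta[hit]).sum()
    return total / n

def select_next_best_view(candidates, surface_mask, voxel_beta):
    """Return the candidate view with the largest gain, plus all gains."""
    gains = {name: information_gain(rays, surface_mask, voxel_beta)
             for name, rays in candidates.items()}
    return max(gains, key=gains.get), gains
```

Because this is a differential entropy, it becomes negative when \(\beta\) is small (below roughly 0.24), so a practical implementation may need to offset or clamp the values before summing.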
In conclusion, by introducing CSV and a modified neural implicit surface network, this paper provides a view selection method that remains robust under limited input data and improves the quality of 3D scene reconstruction.