TD-NeRF: Novel Truncated Depth Prior for Joint Camera Pose and Neural Radiance Field Optimization

Zhen Tan, Zongtan Zhou, Yangbing Ge, Zi Wang, Xieyuanli Chen, Dewen Hu
2024-10-07
Abstract: The reliance on accurate camera poses is a significant barrier to the widespread deployment of Neural Radiance Fields (NeRF) models for 3D reconstruction and SLAM tasks. The existing method introduces monocular depth priors to jointly optimize the camera poses and NeRF, which fails to fully exploit the depth priors and neglects the impact of their inherent noise. In this paper, we propose Truncated Depth NeRF (TD-NeRF), a novel approach that enables training NeRF from unknown camera poses by jointly optimizing learnable parameters of the radiance field and camera poses. Our approach explicitly utilizes monocular depth priors through three key advancements: 1) we propose a novel depth-based ray sampling strategy based on the truncated normal distribution, which improves the convergence speed and accuracy of pose estimation; 2) to circumvent local minima and refine depth geometry, we introduce a coarse-to-fine training strategy that progressively improves the depth precision; 3) we propose a more robust inter-frame point constraint that enhances robustness against depth noise during training. The experimental results on three datasets demonstrate that TD-NeRF achieves superior performance in the joint optimization of camera pose and NeRF, surpassing prior works, and generates more accurate depth geometry. The implementation of our method has been released at https://github.com/nubot-nudt/TD-NeRF.
Computer Vision and Pattern Recognition, Artificial Intelligence, Robotics
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper, "TD-NeRF: Novel Truncated Depth Prior for Joint Camera Pose and Neural Radiance Field Optimization", addresses the heavy dependence of NeRF (Neural Radiance Fields) models on accurate camera poses in 3D reconstruction and SLAM tasks. Existing methods introduce monocular depth priors to jointly optimize camera poses and NeRF, but they fail to fully exploit those priors and ignore their inherent noise. The paper proposes three improvements:

1. **Truncated Depth Priors**: A depth-based ray sampling strategy built on the truncated normal distribution (Truncated Depth-Based Sampling, TDBS) improves the convergence speed and accuracy of pose estimation.
2. **Coarse-to-Fine Training Strategy**: Training proceeds from coarse to fine, progressively improving depth accuracy and avoiding local minima.
3. **More Robust Inter-Frame Point Constraints**: A more robust inter-frame point constraint increases resistance to depth noise during training.

Together, these changes allow NeRF to be trained from unknown camera poses in both indoor and outdoor scenes. Experiments on three datasets show that TD-NeRF surpasses existing methods at jointly optimizing camera poses and NeRF, and that it produces more accurate depth geometry.
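The truncated-normal depth sampling described above can be illustrated with a small sketch. This is not the authors' implementation: the function name `sample_truncated_depths`, the parameters `sigma`, `near`, and `far`, and the choice of inverse-CDF sampling are all assumptions made for illustration. The summary does not say how TD-NeRF sets the spread of the distribution or how it changes during coarse-to-fine training. The sketch only shows the general idea of concentrating ray samples around a monocular depth prior while keeping them inside the ray's valid range.

```python
import random
from statistics import NormalDist


def sample_truncated_depths(depth_prior, sigma, near, far, n_samples, rng=None):
    """Draw sorted sample depths along one ray.

    Samples follow a normal distribution centred on the depth prior and
    truncated to [near, far]. All names here are hypothetical, not taken
    from the TD-NeRF code base.
    """
    if sigma <= 0 or not near < far:
        raise ValueError("need sigma > 0 and near < far")
    rng = rng or random.Random()
    dist = NormalDist(mu=depth_prior, sigma=sigma)

    # CDF values at the truncation bounds; uniform draws between them map
    # back (via the inverse CDF) to normal samples restricted to [near, far].
    cdf_near, cdf_far = dist.cdf(near), dist.cdf(far)

    depths = []
    for _ in range(n_samples):
        u = cdf_near + (cdf_far - cdf_near) * rng.random()
        # inv_cdf is only defined on the open interval (0, 1).
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        t = dist.inv_cdf(u)
        # Guard against floating-point drift just outside the bounds.
        depths.append(min(max(t, near), far))
    return sorted(depths)


if __name__ == "__main__":
    # Assumed values: a prior depth of 2.0 with a fairly tight spread.
    samples = sample_truncated_depths(2.0, 0.1, 0.5, 6.0, 8, random.Random(0))
    print([round(s, 3) for s in samples])
```

A larger `sigma` spreads samples over more of the ray, which tolerates a noisier depth prior. A smaller `sigma` packs them near the predicted surface. A schedule that narrows `sigma` over training would be one way to realise a coarse-to-fine strategy, though that link is an assumption here rather than something the summary states.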