CRAYM: Neural Field Optimization via Camera RAY Matching

Liqiang Lin, Wenpeng Wu, Chi-Wing Fu, Hao Zhang, Hui Huang
2024-12-02
Abstract: We introduce camera ray matching (CRAYM) into the joint optimization of camera poses and neural fields from multi-view images. The optimized field, referred to as a feature volume, can be "probed" by the camera rays for novel view synthesis (NVS) and 3D geometry reconstruction. One key reason for matching camera rays, instead of pixels as in prior works, is that the camera rays can be parameterized by the feature volume to carry both geometric and photometric information. Multi-view consistencies involving the camera rays and scene rendering can be naturally integrated into the joint optimization and network training, to impose physically meaningful constraints to improve the final quality of both the geometric reconstruction and photorealistic rendering. We formulate our per-ray optimization and matched ray coherence by focusing on camera rays passing through keypoints in the input images to elevate both the efficiency and accuracy of scene correspondences. Accumulated ray features along the feature volume provide a means to discount the coherence constraint amid erroneous ray matching. We demonstrate the effectiveness of CRAYM for both NVS and geometry reconstruction, over dense- or sparse-view settings, with qualitative and quantitative comparisons to state-of-the-art alternatives.
Computer Vision and Pattern Recognition, Graphics
What problem does this paper attempt to address?
This paper addresses how to jointly optimize camera poses and neural fields from multi-view images in order to improve the quality of novel view synthesis (NVS) and 3D geometry reconstruction. It introduces Camera RAY Matching (CRAYM) to counter the performance degradation that noisy camera poses cause in existing methods.

### Problem Background

In multi-view 3D reconstruction, accurate camera poses are crucial for generating high-quality novel views and 3D models. In practice, pose information may come from devices such as GPS or IMU sensors and can be noisy, which degrades the reconstruction. Both traditional multi-view stereo (MVS) methods and recent neural field methods such as NeRF rely on accurate camera poses, and their performance declines significantly when the poses are inaccurate.

### Solution

CRAYM addresses these problems in the following ways:

1. **Camera Ray Matching**: Unlike traditional pixel matching, CRAYM matches camera rays. Each ray carries not only a 2D pixel value but also 3D spatial information, which makes it possible to impose explicit geometric constraints while optimizing camera poses.
2. **Feature Volume**: CRAYM parameterizes camera rays by a feature volume that encodes both geometric and photometric information. The ray-matching constraints therefore act directly on the feature volume, imposing physically meaningful constraints during the joint optimization.
3. **Multi-view Consistency**: CRAYM uses multi-view consistency to improve both geometric reconstruction and photometric rendering. It enforces consistent color and geometry by matching rays that pass through keypoints, and it uses auxiliary rays to supply context that makes the key rays more robust.
4. **Loss Function Design**: CRAYM introduces two geometric losses (an epipolar loss and a point-alignment loss) to further promote the consistency of ray matching and improve reconstruction quality.

### Summary

By introducing camera ray matching and the feature volume, CRAYM mitigates the impact of camera pose noise on multi-view 3D reconstruction and achieves better performance in both novel view synthesis and 3D geometry reconstruction. Its advantage is most apparent on fine-grained details.

### Formula Representation

The key formulas in the paper are:

- **Ray Equation**: \[ r(t) = r_{o} + t\, r_{d} \quad (t \geq 0) \] where \(r_{o}\) is the camera center and \(r_{d}\) is the normalized viewing direction.
- **Accumulated Ray Feature**: \[ f(r_{k}) = \int_{0}^{\infty} T(p_{k})\, \sigma(p_{k})\, f^{\prime\prime}(p_{k})\, dt \] where \(p_{k} = r_{k}(t)\) is a point on the key ray \(r_{k}\) and \(T(r_{k}(t)) = \exp\left(-\int_{0}^{t} \sigma(r_{k}(s))\, ds\right)\) is the accumulated transmittance along \(r_{k}\).
- **Matching Ray Consistency Module**: \[ c(r_{k}) = w\, c(r'_{k}) + (1 - w)\, c(r_{k}) \] where \(r'_{k}\) is the ray matched to \(r_{k}\) and \(w\) is the matching credibility, calculated as the cosine distance between the accumulated features of the matching rays.
- **Total Loss Function**: \[ L = \lambda_{1} L_{p} + \lambda_{2} L_{s} + \lambda_{3} L_{e} + \lambda_{4} L_{a} \] where \(L_{p}\) is the photometric loss, \(L_{s}\) is the SSIM loss, \(L_{e}\) is the epipolar loss, and \(L_{a}\) is the point-alignment loss.

With these formulations, CRAYM jointly optimizes camera poses and the neural field, yielding high-quality rendering and reconstruction.
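
The formulas above can be illustrated numerically with a short NumPy sketch. This is not the authors' implementation. The function names (`ray_points`, `accumulate_ray_feature`, `blend_matched_color`, `total_loss`) are hypothetical, and the integral is discretized with the standard NeRF-style quadrature, which is an assumption. The credibility \(w\) is taken here as the cosine similarity of the accumulated features clamped to [0, 1], \(c\) is assumed to be a rendered color, and the \(\lambda\) weights are placeholders rather than values from the paper.

```python
import numpy as np

def ray_points(origin, direction, t):
    """Points r(t) = r_o + t * r_d for an array of depths t >= 0."""
    direction = direction / np.linalg.norm(direction)  # r_d is normalized
    return origin[None, :] + t[:, None] * direction[None, :]

def accumulate_ray_feature(sigma, feats, t):
    """Quadrature approximation of the accumulated ray feature.

    sigma: (N,) densities, feats: (N, C) per-sample features,
    t: (N,) increasing depths. Returns the (C,) accumulated feature
    and the (N,) per-sample weights T_i * alpha_i.
    """
    delta = np.diff(t)
    delta = np.append(delta, delta[-1])       # reuse last spacing (assumption)
    alpha = 1.0 - np.exp(-sigma * delta)      # opacity of each interval
    # Transmittance before sample i: product of (1 - alpha_j) for j < i.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = trans * alpha
    return weights @ feats, weights

def blend_matched_color(c_key, c_match, f_key, f_match, eps=1e-8):
    """c(r_k) = w * c(r'_k) + (1 - w) * c(r_k), with w from feature agreement."""
    cos = f_key @ f_match / (np.linalg.norm(f_key) * np.linalg.norm(f_match) + eps)
    w = np.clip(cos, 0.0, 1.0)                # assumed mapping of credibility to a weight
    return w * c_match + (1.0 - w) * c_key, w

def total_loss(l_p, l_s, l_e, l_a, lambdas=(1.0, 1.0, 1.0, 1.0)):
    """L = l1*L_p + l2*L_s + l3*L_e + l4*L_a (lambdas are placeholders)."""
    return sum(lam * term for lam, term in zip(lambdas, (l_p, l_s, l_e, l_a)))

# Toy scene: a dense slab between z = 1.5 and z = 2.5 with a constant feature.
t = np.linspace(0.0, 4.0, 64)
pts = ray_points(np.zeros(3), np.array([0.0, 0.0, 2.0]), t)
sigma = np.where((pts[:, 2] > 1.5) & (pts[:, 2] < 2.5), 5.0, 0.0)
feats = np.tile(np.array([1.0, 0.0, 0.0]), (t.size, 1))
f_key, weights = accumulate_ray_feature(sigma, feats, t)
```

When two matched rays accumulate nearly identical features, `w` approaches 1 and the matched ray's color dominates the blend. When the features disagree, `w` falls toward 0 and the key ray keeps its own color, which is how an erroneous match gets discounted.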