Dynamic Multi-View Scene Reconstruction Using Neural Implicit Surface

Decai Chen, Haofei Lu, Ingo Feldmann, Oliver Schreer, Peter Eisert
2023-03-01
Abstract: Reconstructing general dynamic scenes is important for many computer vision and graphics applications. Recent works represent the dynamic scene with neural radiance fields for photorealistic view synthesis, while their surface geometry is under-constrained and noisy. Other works introduce surface constraints to the implicit neural representation to disentangle the ambiguity of geometry and appearance field for static scene reconstruction. To bridge the gap between rendering dynamic scenes and recovering static surface geometry, we propose a template-free method to reconstruct surface geometry and appearance using neural implicit representations from multi-view videos. We leverage topology-aware deformation and the signed distance field to learn complex dynamic surfaces via differentiable volume rendering without scene-specific prior knowledge like template models. Furthermore, we propose a novel mask-based ray selection strategy to significantly boost the optimization on challenging time-varying regions. Experiments on different multi-view video datasets demonstrate that our method achieves high-fidelity surface reconstruction as well as photorealistic novel view synthesis.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the reconstruction of high-fidelity surface geometry and appearance of dynamic scenes from multi-view videos. Existing dynamic-scene methods achieve photorealistic view synthesis, but their surface geometry is under-constrained and noisy because geometry and appearance are entangled in the learned fields. In addition, many methods rely on scene-specific prior knowledge (such as template models), which limits their generalization. The paper therefore proposes a template-free method that uses neural implicit representations to reconstruct both surface geometry and appearance from multi-view videos.

The method rests on three key technical components:

1. **Topology-aware deformation**: An SE(3)-field network maps spatial points from the observation space to a canonical space, so that geometry and appearance information is shared consistently across time.
2. **Signed distance function (SDF)**: An SDF network represents the geometry, and surface normals are computed through automatic differentiation, which helps separate the geometry field from the appearance field.
3. **Mask-based ray selection strategy**: A new ray selection strategy assigns a higher sampling probability to time-varying foreground regions, which significantly improves optimization in those regions.

Together, these components allow the method to achieve high-quality surface reconstruction and photorealistic novel view synthesis on different multi-view video datasets.
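
To make the idea of mask-based ray selection more concrete, the following is a minimal sketch of weighted pixel (ray) sampling driven by two binary masks. It is an illustration under stated assumptions, not the authors' implementation: the function names, the two-mask input (`foreground`, `dynamic`), and the weight values `w_bg`, `w_fg`, and `w_dyn` are all hypothetical, and the paper may derive its masks and probabilities differently.

```python
import random


def ray_sampling_weights(foreground, dynamic, w_bg=1.0, w_fg=2.0, w_dyn=4.0):
    """Build per-pixel sampling weights from two flattened binary masks.

    foreground[i] is truthy if pixel i belongs to the foreground;
    dynamic[i] is truthy if pixel i changes over time.
    The weight values are illustrative assumptions, not values from the paper.
    """
    weights = []
    for fg, dyn in zip(foreground, dynamic):
        if fg and dyn:
            weights.append(w_dyn)   # time-varying foreground: sampled most often
        elif fg:
            weights.append(w_fg)    # static foreground
        else:
            weights.append(w_bg)    # background
    return weights


def sample_ray_indices(weights, n_rays, rng=None):
    """Draw n_rays pixel indices with probability proportional to weights."""
    rng = rng or random.Random()
    return rng.choices(range(len(weights)), weights=weights, k=n_rays)


if __name__ == "__main__":
    fg_mask = [0, 1, 1, 0]
    dyn_mask = [0, 0, 1, 1]
    w = ray_sampling_weights(fg_mask, dyn_mask)
    picks = sample_ray_indices(w, n_rays=8, rng=random.Random(0))
    print(w, picks)
```

Sampling with replacement keeps the sketch short; a training loop would typically draw a fresh batch of ray indices per iteration, so that time-varying foreground pixels contribute more rays to each optimization step than background pixels do.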