Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction

Ziyi Yang, Xinyu Gao, Wen Zhou, Shaohui Jiao, Yuqing Zhang, Xiaogang Jin
2023-11-19
Abstract: Implicit neural representation has paved the way for new approaches to dynamic scene reconstruction and rendering. Nonetheless, cutting-edge dynamic neural rendering methods rely heavily on these implicit representations, which frequently struggle to capture the intricate details of objects in the scene. Furthermore, implicit methods have difficulty achieving real-time rendering in general dynamic scenes, limiting their use in a variety of tasks. To address the issues, we propose a deformable 3D Gaussians Splatting method that reconstructs scenes using 3D Gaussians and learns them in canonical space with a deformation field to model monocular dynamic scenes. We also introduce an annealing smoothing training mechanism with no extra overhead, which can mitigate the impact of inaccurate poses on the smoothness of time interpolation tasks in real-world datasets. Through a differential Gaussian rasterizer, the deformable 3D Gaussians not only achieve higher rendering quality but also real-time rendering speed. Experiments show that our method outperforms existing methods significantly in terms of both rendering quality and speed, making it well-suited for tasks such as novel-view synthesis, time interpolation, and real-time rendering.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses three related problems in dynamic scene reconstruction and rendering:

1. **Detail Capture Issue**: Existing neural rendering methods, which rely on implicit representations, struggle to capture fine details of objects in dynamic scenes.
2. **Real-time Rendering Issue**: For general dynamic scenes, existing methods have difficulty achieving real-time rendering, which limits their practicality in many applications.
3. **Efficient and High-Quality Rendering Issue**: Achieving high rendering quality and high speed at the same time remains a challenge for dynamic scene modeling.

To address these problems, the paper proposes a method based on Deformable 3D Gaussians for reconstructing and rendering monocular dynamic scenes. Its main contributions are:

- **Deformable 3D Gaussian Framework**: The scene is represented with 3D Gaussians that are learned in a canonical space, and a deformation field adapts them to the scene's changes over time. This achieves high-quality reconstruction while reaching real-time rendering speeds.
- **Smoothing Training Mechanism**: To mitigate the impact of inaccurate pose estimation in real-world datasets, the paper introduces a mechanism called "Annealing Smoothing Training (AST)." It improves the model's temporal generalization without adding computational overhead and avoids over-smoothing, so the details of dynamic objects are preserved.
- **Extending 3D-GS to Dynamic Scenes**: The work is described as the first to extend 3D Gaussian Splatting (3D-GS) to dynamic scenes. It does so by introducing a deformation field that lets the 3D Gaussians be learned in a canonical space, which improves the flexibility and accuracy of the scene representation.
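The canonical-space idea described above can be illustrated with a minimal sketch. It assumes the deformation field is a small MLP that takes a positionally encoded Gaussian center and time and returns additive offsets for position, rotation, and scale. The class `ToyDeformationField`, the function `deform`, the layer sizes, the encoding frequencies, and the dictionary layout are all hypothetical and chosen for illustration; they are not taken from the paper or its code, and the weights here are random rather than trained.

```python
import numpy as np

def positional_encoding(v, num_freqs):
    """Encode each coordinate as sin(2^k v), cos(2^k v) for k = 0..num_freqs-1."""
    freqs = 2.0 ** np.arange(num_freqs)                  # (L,)
    scaled = v[..., None] * freqs                        # (N, D, L)
    enc = np.concatenate([np.sin(scaled), np.cos(scaled)], axis=-1)  # (N, D, 2L)
    return enc.reshape(*v.shape[:-1], -1)                # (N, D * 2L)

class ToyDeformationField:
    """Hypothetical two-layer MLP: (position, time) -> offsets (untrained)."""

    def __init__(self, pos_freqs=4, time_freqs=2, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = 3 * 2 * pos_freqs + 1 * 2 * time_freqs
        self.pos_freqs, self.time_freqs = pos_freqs, time_freqs
        self.w1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        # 10 outputs: 3 position + 4 quaternion + 3 scale offsets.
        self.w2 = rng.normal(0.0, 0.01, (hidden, 10))
        self.b2 = np.zeros(10)

    def __call__(self, xyz, t):
        t_col = np.full((xyz.shape[0], 1), t, dtype=xyz.dtype)
        feats = np.concatenate(
            [positional_encoding(xyz, self.pos_freqs),
             positional_encoding(t_col, self.time_freqs)], axis=-1)
        hidden = np.maximum(feats @ self.w1 + self.b1, 0.0)  # ReLU
        out = hidden @ self.w2 + self.b2
        return out[:, :3], out[:, 3:7], out[:, 7:]

def deform(canonical, field, t):
    """Return a deformed copy of the canonical Gaussians at time t.

    Assumption: offsets are simply added, and the quaternion is renormalised.
    The canonical parameters themselves are left untouched.
    """
    d_xyz, d_rot, d_scale = field(canonical["xyz"], t)
    rotation = canonical["rotation"] + d_rot
    rotation = rotation / np.linalg.norm(rotation, axis=1, keepdims=True)
    return {
        "xyz": canonical["xyz"] + d_xyz,
        "rotation": rotation,
        "scale": canonical["scale"] + d_scale,
    }

if __name__ == "__main__":
    canonical = {
        "xyz": np.random.default_rng(1).normal(size=(5, 3)),
        "rotation": np.tile([1.0, 0.0, 0.0, 0.0], (5, 1)),  # identity quaternions
        "scale": np.full((5, 3), 0.1),
    }
    field = ToyDeformationField()
    for t in (0.0, 0.5, 1.0):
        print(t, deform(canonical, field, t)["xyz"][0])
```

The point of the design is that a single set of Gaussians is stored and optimised, while all time dependence lives in the field. In a full pipeline the deformed Gaussians would then be passed to the rasterizer, which is omitted here.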
According to the reported experiments, the method significantly outperforms existing methods in both rendering quality and speed on synthetic and real-world datasets, with the largest gains in scenes with complex details and in temporal interpolation tasks.
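The summary above does not spell out how the annealing smoothing mechanism works, so the following is only a sketch under stated assumptions. It assumes that Gaussian noise is added to the time input during training and that the noise scale decays linearly to zero over a fixed number of iterations. The function name `annealed_time_noise`, the parameters `beta`, `anneal_steps`, and `frame_interval`, and the default values are hypothetical.

```python
import numpy as np

def annealed_time_noise(step, anneal_steps, frame_interval, beta=0.1, rng=None):
    """Sample noise for the time input at a given training step.

    Assumed schedule: standard normal noise scaled by beta * frame_interval
    and by a factor that decays linearly from 1 to 0 over anneal_steps.
    After anneal_steps the noise is exactly zero, so late training sees
    clean timestamps and fine detail is not smoothed away.
    """
    rng = np.random.default_rng() if rng is None else rng
    decay = max(0.0, 1.0 - step / anneal_steps)
    return rng.standard_normal() * beta * frame_interval * decay

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame_interval = 1.0 / 100  # hypothetical: 100 frames over normalised time [0, 1]
    t = 0.37
    for step in (0, 10_000, 20_000):
        noisy_t = t + annealed_time_noise(step, 20_000, frame_interval, rng=rng)
        print(step, noisy_t)
```

Under these assumptions the only added cost per iteration is one random draw and a few scalar multiplications, which is consistent with the claim of no extra overhead. Early noise encourages the deformation field to vary smoothly between neighbouring timestamps, which is what time interpolation relies on.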