4D Scaffold Gaussian Splatting for Memory Efficient Dynamic Scene Reconstruction

Woong Oh Cho, In Cho, Seoha Kim, Jeongmin Bae, Youngjung Uh, Seon Joo Kim
2024-11-26
Abstract: Existing 4D Gaussian methods for dynamic scene reconstruction offer high visual fidelity and fast rendering. However, these methods suffer from excessive memory and storage demands, which limits their practical deployment. This paper proposes a 4D anchor-based framework that retains the visual quality and rendering speed of 4D Gaussians while significantly reducing storage costs. Our method extends 3D scaffolding to 4D space, and leverages sparse 4D grid-aligned anchors with compressed feature vectors. Each anchor models a set of neural 4D Gaussians, each of which represents a local spatiotemporal region. In addition, we introduce a temporal coverage-aware anchor growing strategy to effectively assign additional anchors to under-reconstructed dynamic regions. Our method adjusts the accumulated gradients based on Gaussians' temporal coverage, improving reconstruction quality in dynamic regions. To reduce the number of anchors, we further present enhanced formulations of neural 4D Gaussians. These include the neural velocity and the temporal opacity derived from a generalized Gaussian distribution. Experimental results demonstrate that our method achieves state-of-the-art visual quality and a 97.8% storage reduction over 4DGS.
Computer Vision and Pattern Recognition, Graphics
What problem does this paper attempt to address?
This paper addresses the excessive storage cost of 4D Gaussian methods for dynamic scene reconstruction. Existing 4D Gaussian methods achieve high visual fidelity and fast rendering, but their memory and storage requirements limit practical deployment. The paper therefore proposes a 4D anchor-based framework that significantly reduces storage costs while retaining the visual quality and rendering speed of 4D Gaussians.

### Specific Problems and Solutions

1. **Problems**:
   - **High Storage Cost**: Existing 4D Gaussian methods require a large amount of storage; for long videos, the requirement may exceed 6 GB.
   - **Reconstruction Quality in Dynamic Regions**: Existing methods tend to under-reconstruct dynamic regions.
2. **Solutions**:
   - **4D Anchor Framework**: Sparse 4D grid-aligned anchors are introduced, each represented by a compressed feature vector. Each anchor models a set of neural 4D Gaussians, and each Gaussian represents a local spatio-temporal region.
   - **Temporal Coverage-Aware Anchor Growth Strategy**: The accumulated gradients are adjusted according to the temporal coverage of the Gaussians, so that additional anchors are assigned to under-reconstructed dynamic regions.
   - **Enhanced Neural 4D Gaussians**: A neural velocity and a temporal opacity derived from a generalized Gaussian distribution are introduced to better capture how scene elements change and to reduce the number of anchors required.

### Method Overview

1. **Initializing 4D Anchors**:
   - Use the static point cloud of multi-view frames to initialize the positions and feature vectors of the 4D anchors.
   - Each anchor has a unique 4D spatio-temporal position \( p\in\mathbb{R}^4 \) and a feature vector \( f\in\mathbb{R}^C \).
2. **Generating Neural 4D Gaussians**:
   - Use a shared multi-layer perceptron (MLP) to generate multiple neural 4D Gaussians from anchor features.
   - The attributes of each Gaussian include a time-invariant opacity \( \alpha\in\mathbb{R} \), a rotation quaternion \( q\in\mathbb{R}^4 \), a scaling factor \( s\in\mathbb{R}^3 \), a view-dependent color \( c\in\mathbb{R}^3 \), a temporal scale \( \sigma\in\mathbb{R} \), and a neural velocity \( v\in\mathbb{R}^3 \).
3. **Rendering Neural Gaussians**:
   - Compute the 3D Gaussian parameters at a specific time \( t \), namely the center position \( \mu_k(t) \) and the opacity \( \alpha_k(t) \):
     \[ \mu_k(t)=\mu_{k,1:3}+h(t,\mu_{k,4},v_k) \]
     \[ \alpha_k(t)=\alpha_k\cdot g(t,\mu_{k,4},\sigma_k) \]
   - Render using an efficient 3D Gaussian rasterization pipeline.
4. **Temporal Coverage-Aware Anchor Growth**:
   - Compute the accumulated gradient \( \nabla g \) of each 4D Gaussian, weighting each contribution by the Gaussian's temporal coverage:
     \[ \nabla g = \frac{\sum_{i = 1}^N w(t,\sigma)\,\|\nabla_{2D}\|}{\sum_{i = 1}^N w(t,\sigma)} \]
     \[ w(t,\sigma)=\alpha(t)\left(\frac{1}{\sigma}\right)^\gamma \]
   - Place new anchors at the centers of voxels where the accumulated gradient exceeds the threshold.
5. **Enhanced 4D Gaussian Modeling**:
   - **Neural Velocity**: Model local motion as linear motion:
     \[