Self-Ensembling Gaussian Splatting for Few-shot Novel View Synthesis

Chen Zhao, Xuan Wang, Tong Zhang, Saqib Javed, Mathieu Salzmann
2024-11-01
Abstract: 3D Gaussian Splatting (3DGS) has demonstrated remarkable effectiveness for novel view synthesis (NVS). However, the 3DGS model tends to overfit when trained with sparse posed views, limiting its generalization capacity for broader pose variations. In this paper, we alleviate the overfitting problem by introducing a self-ensembling Gaussian Splatting (SE-GS) approach. We present two Gaussian Splatting models named the $\mathbf{\Sigma}$-model and the $\mathbf{\Delta}$-model. The $\mathbf{\Sigma}$-model serves as the primary model that generates novel-view images during inference. At the training stage, the $\mathbf{\Sigma}$-model is guided away from specific local optima by an uncertainty-aware perturbing strategy. We dynamically perturb the $\mathbf{\Delta}$-model based on the uncertainties of novel-view renderings across different training steps, resulting in diverse temporal models sampled from the Gaussian parameter space without additional training costs. The geometry of the $\mathbf{\Sigma}$-model is regularized by penalizing discrepancies between the $\mathbf{\Sigma}$-model and the temporal samples. Therefore, our SE-GS conducts an effective and efficient regularization across a large number of Gaussian Splatting models, resulting in a robust ensemble, the $\mathbf{\Sigma}$-model. Experimental results on the LLFF, Mip-NeRF360, DTU, and MVImgNet datasets show that our approach improves NVS quality with few-shot training views, outperforming existing state-of-the-art methods. The code is released at https://github.com/sailor-z/SE-GS.
Computer Vision and Pattern Recognition, Graphics
What problem does this paper attempt to address?
### Problems Addressed by the Paper

This paper addresses overfitting in the 3D Gaussian Splatting (3DGS) model when performing Novel View Synthesis (NVS) from few-shot images. When the training viewpoints are sparse, the 3DGS model tends to fall into local optima and generalizes poorly to new viewpoints. To counter this, the authors introduce a Self-Ensembling Gaussian Splatting (SE-GS) method.

### Main Contributions

1. **Self-Ensembling Mechanism**: SE-GS generates diverse temporal samples during training and uses them to regularize the main model, which mitigates overfitting.
2. **Uncertainty-Aware Perturbation**: SE-GS dynamically perturbs model parameters based on the uncertainty of rendered images, producing diverse temporal samples without additional training costs.
3. **Efficient Regularization**: SE-GS penalizes the differences between the main model (Σ-model) and the temporal samples obtained from the Δ-model, improving the main model's robustness and generalization ability.
4. **Experimental Validation**: Experiments were conducted on the LLFF, Mip-NeRF360, DTU, and MVImgNet datasets. The results show that SE-GS improves the quality of novel view synthesis from few-shot images, surpassing existing state-of-the-art methods.

### Solution

1. **Model Architecture**:
   - **Σ-model**: Serves as the main model, used to generate novel-view images during inference.
   - **Δ-model**: Generates diverse temporal samples through uncertainty-aware perturbation during training.
2. **Uncertainty-Aware Perturbation**:
   - Sample pseudo-viewpoints from the camera trajectory of the training viewpoints, render images, and store them in a buffer.
   - Calculate the pixel-level uncertainty of the images in the buffer to identify pixels with high uncertainty.
   - Randomly perturb the Gaussian parameters corresponding to these pixels to generate diverse temporal samples.
3. **Regularization**:
   - Minimize the differences between the main model and the temporal samples through a photometric loss.
   - Use a co-pruning strategy to further strengthen the regularization.

### Experimental Results

- **Quantitative Results**: Experiments on the LLFF, DTU, Mip-NeRF360, and MVImgNet datasets show that SE-GS improves the quality of novel view synthesis from few-shot images, as measured by PSNR, SSIM, LPIPS, and AVGE.
- **Qualitative Results**: Compared to 3DGS and CoR-GS, the novel-view images generated by SE-GS have fewer visual artifacts and capture finer details in regions with complex texture.

### Conclusion

By combining a self-ensembling mechanism with uncertainty-aware perturbation, this paper addresses the overfitting of the 3D Gaussian Splatting model on few-shot images and improves the quality and robustness of novel view synthesis. Experimental results show that SE-GS outperforms existing state-of-the-art methods across multiple datasets.
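
### Illustrative Sketch

The uncertainty-aware perturbation and photometric regularization steps listed under Solution can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the released repository for that). It rests on several assumptions: uncertainty is taken to be the per-pixel standard deviation over the buffered renderings, each Gaussian is assumed to map to a single pixel through a hypothetical `pixel_of_gaussian` index array, only Gaussian positions are perturbed, and the photometric loss is a plain L1 difference. The function names, `threshold`, and `noise_scale` are all hypothetical.

```python
import numpy as np


def pixel_uncertainty(buffer):
    """Per-pixel standard deviation over K buffered renderings of shape (K, H, W).

    Assumption: standard deviation stands in for the paper's uncertainty measure.
    """
    return buffer.std(axis=0)


def perturb_gaussians(positions, pixel_of_gaussian, uncertainty,
                      threshold, noise_scale, rng):
    """Add random noise to Gaussians that land on high-uncertainty pixels.

    positions:         (N, 3) Gaussian centers.
    pixel_of_gaussian: (N, 2) integer (row, col) per Gaussian (hypothetical mapping).
    uncertainty:       (H, W) map from pixel_uncertainty.
    Returns the perturbed positions and the boolean mask of perturbed Gaussians.
    """
    rows, cols = pixel_of_gaussian[:, 0], pixel_of_gaussian[:, 1]
    mask = uncertainty[rows, cols] > threshold
    noise = rng.normal(0.0, noise_scale, size=positions.shape)
    # Gaussians on low-uncertainty pixels receive zero noise.
    return positions + noise * mask[:, None], mask


def photometric_l1(render_sigma, render_sample):
    """Mean absolute difference between two renderings (assumed L1 photometric loss)."""
    return float(np.abs(render_sigma - render_sample).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Buffer of 4 pseudo-view renderings at 8x8 resolution (random stand-in data).
    buffer = rng.random((4, 8, 8))
    unc = pixel_uncertainty(buffer)
    positions = rng.random((16, 3))
    pixels = rng.integers(0, 8, size=(16, 2))
    sample, mask = perturb_gaussians(positions, pixels, unc,
                                     threshold=float(np.median(unc)),
                                     noise_scale=0.01, rng=rng)
    print("perturbed Gaussians:", int(mask.sum()), "of", len(mask))
    print("L1 between two buffered renderings:", photometric_l1(buffer[0], buffer[1]))
```

In this sketch the perturbed copy plays the role of a temporal sample, and a loss such as `photometric_l1` would be evaluated between a rendering of the Σ-model and a rendering of that sample from the same pseudo-viewpoint.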