AGG: Amortized Generative 3D Gaussians for Single Image to 3D

Dejia Xu, Ye Yuan, Morteza Mardani, Sifei Liu, Jiaming Song, Zhangyang Wang, Arash Vahdat
2024-01-09
Abstract: Given the growing need for automatic 3D content creation pipelines, various 3D representations have been studied to generate 3D objects from a single image. Due to its superior rendering efficiency, 3D Gaussian splatting-based models have recently excelled in both 3D reconstruction and generation. 3D Gaussian splatting approaches for image to 3D generation are often optimization-based, requiring many computationally expensive score-distillation steps. To overcome these challenges, we introduce an Amortized Generative 3D Gaussian framework (AGG) that instantly produces 3D Gaussians from a single image, eliminating the need for per-instance optimization. Utilizing an intermediate hybrid representation, AGG decomposes the generation of 3D Gaussian locations and other appearance attributes for joint optimization. Moreover, we propose a cascaded pipeline that first generates a coarse representation of the 3D data and later upsamples it with a 3D Gaussian super-resolution module. Our method is evaluated against existing optimization-based 3D Gaussian frameworks and sampling-based pipelines utilizing other 3D representations, where AGG showcases competitive generation abilities both qualitatively and quantitatively while being several orders of magnitude faster. Project page:
Computer Vision and Pattern Recognition
### What problem does this paper attempt to solve?

This paper addresses the generation of 3D content from a single image, and specifically how to produce 3D Gaussian representations efficiently. It focuses on three goals:

1. **Reducing optimization steps**: Existing 3D Gaussian-based generation methods usually rely on per-instance optimization, which requires substantial compute and time. The authors introduce an Amortized Generative 3D Gaussian framework (AGG) that generates 3D Gaussians directly from a single image without per-instance optimization.
2. **Increasing generation speed**: Existing optimization-based methods can generate high-quality 3D models, but they are slow. AGG aims to increase generation speed significantly while maintaining quality; the abstract reports that it is several orders of magnitude faster than the methods it is compared against.
3. **Simplifying the 3D content creation process**: With the development of virtual reality and augmented reality technologies, the demand for automated 3D content creation is increasing. An efficient single-image-to-3D method would let non-professional users create high-quality 3D content as well.

### Specific contributions of the paper

- **Amortized Generative Framework (AGG)**: A new amortized generative framework that produces 3D Gaussians from a single image in one forward pass, avoiding per-instance optimization.
- **Cascaded generation pipeline**: AGG first generates a coarse 3D Gaussian representation and then refines it with a super-resolution module. This design makes the generation process more stable and lets it handle complex geometric structures.
- **Hybrid representation**: In the coarse stage, AGG uses a hybrid generator that handles geometry (the Gaussian locations) and texture (the other appearance attributes) separately, which improves the stability and efficiency of generation.
- **Efficient super-resolution module**: The second-stage super-resolution module uses point-voxel convolutional networks to extract local features and refine the details of the 3D Gaussians.

### Summary

The main goal of this paper is fast, efficient generation of 3D Gaussians from a single image through the Amortized Generative Framework (AGG), which simplifies 3D content creation and speeds up generation. Compared with existing optimization-based methods, AGG is much faster while remaining competitive in generation quality and stability.
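The two-stage data flow described above (a coarse set of Gaussians whose geometry and appearance are predicted separately, followed by an upsampling step) can be sketched with NumPy. This is only an illustrative toy, not the authors' implementation: the function names `coarse_stage`, `super_resolve`, and `generate`, the dictionary layout, and all parameter values are assumptions, and random or constant placeholders stand in for the learned networks (including the point-voxel super-resolution module).

```python
import numpy as np


def coarse_stage(image, n_coarse=64, seed=0):
    """Toy stand-in for the coarse generator (hypothetical, not learned)."""
    rng = np.random.default_rng(seed)
    # Geometry branch: Gaussian centers. A real model would predict these
    # from image features; here they are random placeholders.
    positions = rng.uniform(-1.0, 1.0, size=(n_coarse, 3))
    # Appearance branch: every Gaussian receives the mean image color.
    mean_rgb = image.reshape(-1, 3).mean(axis=0)
    return {
        "positions": positions,
        "colors": np.tile(mean_rgb, (n_coarse, 1)),
        "scales": np.full((n_coarse, 3), 0.1),
        "opacities": np.full((n_coarse, 1), 0.5),
    }


def super_resolve(gaussians, factor=4, jitter=0.05, seed=1):
    """Toy upsampler: split each coarse Gaussian into `factor` children."""
    rng = np.random.default_rng(seed)
    # Children inherit all attributes from their parent Gaussian.
    fine = {k: np.repeat(v, factor, axis=0) for k, v in gaussians.items()}
    # Perturb child centers so they spread around the parent location.
    fine["positions"] = fine["positions"] + rng.normal(
        0.0, jitter, size=fine["positions"].shape
    )
    # Shrink children so the set covers roughly the same volume.
    fine["scales"] = fine["scales"] / factor ** (1.0 / 3.0)
    return fine


def generate(image, n_coarse=64, factor=4):
    """Single feed-forward pass: coarse stage, then upsampling."""
    return super_resolve(coarse_stage(image, n_coarse=n_coarse), factor=factor)


if __name__ == "__main__":
    image = np.zeros((8, 8, 3))
    gaussians = generate(image)
    for name, value in gaussians.items():
        print(name, value.shape)
```

The point of the sketch is the structure: one forward pass with no per-instance optimization loop, and a cheap coarse stage whose output is densified afterwards rather than predicting all Gaussians at full resolution at once.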