Abstract: Text-to-3D scene generation holds immense potential for the gaming, film, and architecture sectors. Despite significant progress, existing methods struggle with maintaining high quality, consistency, and editing flexibility. In this paper, we propose DreamScene, a 3D Gaussian-based novel text-to-3D scene generation framework, to tackle the aforementioned three challenges mainly via two strategies. First, DreamScene employs Formation Pattern Sampling (FPS), a multi-timestep sampling strategy guided by the formation patterns of 3D objects, to form fast, semantically rich, and high-quality representations. FPS uses 3D Gaussian filtering for optimization stability, and leverages reconstruction techniques to generate plausible textures. Second, DreamScene employs a progressive three-stage camera sampling strategy, specifically designed for both indoor and outdoor settings, to effectively ensure object-environment integration and scene-wide 3D consistency. Last, DreamScene enhances scene editing flexibility by integrating objects and environments, enabling targeted adjustments. Extensive experiments validate DreamScene's superiority over current state-of-the-art techniques, heralding its wide-ranging potential for diverse applications. Code and demos will be released at https://dreamscene-project.github.io.

What problem does this paper attempt to address?

The paper targets three key challenges in current text-to-3D scene generation methods:

1. **Inefficient generation process**: Existing methods often produce low-quality results and take a long time to complete.
2. **Inconsistent 3D visual cues**: Results look good from specific camera positions, but the scene as a whole lacks 3D consistency.
3. **Difficulty in separating objects from the environment**: Existing methods cannot effectively separate objects from their environment, which limits flexible editing of individual elements.

To address these challenges, the paper proposes **DreamScene**, a novel text-to-3D scene generation framework based on 3D Gaussians. DreamScene relies mainly on two strategies:

1. **Formation Pattern Sampling (FPS)**:
   - FPS is a multi-timestep sampling strategy guided by the formation patterns of 3D objects; it quickly produces semantically rich, high-quality representations.
   - FPS uses 3D Gaussian filtering for optimization stability and reconstruction techniques to generate plausible textures.
2. **Progressive three-stage camera sampling strategy**:
   - Designed for both indoor and outdoor settings, this strategy ensures object-environment integration and scene-wide 3D consistency.

Finally, by integrating objects and environments, DreamScene makes scene editing more flexible, enabling targeted adjustments.

The paper validates DreamScene through extensive experiments and argues that it has broad application potential for generating high-quality, consistent, and editable 3D scenes. According to the abstract, code and demos will be released at [https://dreamscene-project.github.io](https://dreamscene-project.github.io).
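
To make the idea of a progressive, staged camera sampler concrete, here is a minimal Python sketch. The summary does not say what the three stages are, so the stage boundaries, radius ranges, and elevation limits below are purely illustrative assumptions, and `STAGES`, `stage_index`, and `sample_camera` are hypothetical names rather than anything from the DreamScene code.

```python
import math
import random

# Hypothetical stage table: (fraction of training at which the stage ends,
# (min radius, max radius) for camera positions around the scene center).
# All values are illustrative assumptions, not taken from the paper.
STAGES = [
    (0.3, (0.0, 1.0)),  # stage 1: cameras close to the scene center
    (0.7, (1.0, 3.0)),  # stage 2: cameras farther out
    (1.0, (0.0, 3.0)),  # stage 3: cameras anywhere in the full range
]


def stage_index(step, total_steps):
    """Map a training step to the index of the active stage."""
    progress = step / total_steps
    for i, (end, _) in enumerate(STAGES):
        if progress < end:
            return i
    return len(STAGES) - 1  # clamp the final step into the last stage


def sample_camera(step, total_steps, rng):
    """Sample a camera position (x, y, z) for the stage active at `step`."""
    _, (r_min, r_max) = STAGES[stage_index(step, total_steps)]
    radius = rng.uniform(r_min, r_max)
    azimuth = rng.uniform(0.0, 2.0 * math.pi)
    # Assumed elevation limit of +/- 30 degrees to avoid extreme top-down views.
    elevation = rng.uniform(-math.pi / 6, math.pi / 6)
    x = radius * math.cos(elevation) * math.cos(azimuth)
    y = radius * math.sin(elevation)
    z = radius * math.cos(elevation) * math.sin(azimuth)
    return (x, y, z)


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (10, 50, 90):
        print(step, stage_index(step, 100), sample_camera(step, 100, rng))
```

The point of the staging is only that the distribution of camera poses changes as optimization proceeds; a real implementation would also need camera orientations and separate handling for indoor and outdoor scenes.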

### Main contributions

- **DreamScene**: A novel text-driven 3D scene generation framework that efficiently generates high-quality, scene-wide consistent, and editable 3D scenes through Formation Pattern Sampling, strategic camera sampling, and seamless object-environment integration.
- **Formation Pattern Sampling (FPS)**: By combining multi-timestep sampling, 3D Gaussian filtering, and reconstruction techniques, FPS generates high-quality, semantically rich 3D representations within 30 minutes.
- **Qualitative and quantitative experiments**: These show that DreamScene outperforms existing methods in text-driven 3D object and scene generation, indicating its potential in fields such as gaming and film.
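
As a rough illustration of what multi-timestep sampling could look like, the sketch below draws several diffusion timesteps per optimization step from a window that slides from high-noise to low-noise timesteps. This is an assumption-laden sketch, not the paper's schedule: the linear slide, the window width, the stratified draw, and the names `timestep_window` and `sample_timesteps` are all hypothetical.

```python
import random


def timestep_window(step, total_steps, t_min=20, t_max=980, width=200):
    """Return (lo, hi) bounds of the timestep window at this optimization step.

    Assumed schedule: the window's upper bound slides linearly from t_max
    down to t_min + width as training progresses.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    hi = round(t_max - progress * (t_max - t_min - width))
    return hi - width, hi


def sample_timesteps(step, total_steps, m=4, rng=random):
    """Draw m timesteps, one from each of m equal sub-ranges of the window."""
    lo, hi = timestep_window(step, total_steps)
    bin_size = (hi - lo) // m
    if bin_size < 1:
        raise ValueError("m is too large for the window width")
    # Sub-range k covers [lo + k*bin_size, lo + (k+1)*bin_size).
    return [rng.randrange(lo + k * bin_size, lo + (k + 1) * bin_size)
            for k in range(m)]


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 500, 1000):
        print(step, timestep_window(step, 1000), sample_timesteps(step, 1000, rng=rng))
```

Drawing one timestep per sub-range keeps the samples spread across the window, so each update mixes coarser and finer noise levels; whether DreamScene spreads its timesteps this way is not stated in the summary.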

DreamScene: 3D Gaussian-based Text-to-3D Scene Generation via Formation Pattern Sampling

DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting

DreamScape: 3D Scene Creation via Gaussian Splatting joint Correlation Modeling

SceneDreamer360: Text-Driven 3D-Consistent Scene Generation with Panoramic Gaussian Splatting

RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion

DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation

3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation

FastScene: Text-Driven Fast 3D Indoor Scene Generation via Panoramic Gaussian Splatting

SceneDreamer: Unbounded 3D Scene Generation from 2D Image Collections

BrightDreamer: Generic 3D Gaussian Generative Framework for Fast Text-to-3D Synthesis

DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos

HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions

GaussianDreamerPro: Text to Manipulable 3D Gaussians with Highly Enhanced Quality

RoomDreamer: Text-Driven 3D Indoor Scene Synthesis with Coherent Geometry and Texture

3DSceneEditor: Controllable 3D Scene Editing with Gaussian Splatting

GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models

GaussianDreamer: Fast Generation from Text to 3D Gaussian Splatting with Point Cloud Priors

Text2Immersion: Generative Immersive Scene with 3D Gaussians

SceneCraft: Layout-Guided 3D Scene Generation

DreamMapping: High-Fidelity Text-to-3D Generation via Variational Distribution Mapping

Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text