Abstract: Recent text-guided generation of individual 3D objects has achieved great success using diffusion priors. However, these methods are not suitable for object insertion and replacement tasks because they do not consider the background, leading to illumination mismatches with the environment. To bridge this gap, we introduce an illumination-aware 3D scene editing pipeline for the 3D Gaussian Splatting (3DGS) representation. Our key observation is that inpainting by a state-of-the-art conditional 2D diffusion model is consistent with the background in lighting. To leverage the prior knowledge of well-trained diffusion models for 3D object generation, our approach employs a coarse-to-fine object optimization pipeline with inpainted views. In the first, coarse step, we achieve image-to-3D lifting given an ideal inpainted view. This process employs a 3D-aware diffusion prior from a view-conditioned diffusion model, which preserves the illumination present in the conditioning image. To acquire an ideal inpainted image, we introduce an Anchor View Proposal (AVP) algorithm that finds a single view that best represents the scene illumination in the target region. In the second, Texture Enhancement step, we introduce a novel Depth-guided Inpainting Score Distillation Sampling (DI-SDS), which enhances geometry and texture details with the inpainting diffusion prior, beyond the scope of the 3D-aware diffusion prior used in the first coarse step. DI-SDS not only provides fine-grained texture enhancement but also encourages the optimization to respect scene lighting. Our approach efficiently achieves local editing with global illumination consistency without explicitly modeling light transport. We demonstrate the robustness of our method by evaluating edits in real scenes containing explicit highlights and shadows, and we compare against state-of-the-art text-to-3D editing methods.
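
For orientation, the standard Score Distillation Sampling gradient that DI-SDS builds on can be written as below. The first equation is the usual SDS form. The second shows one way a depth-guided inpainting prior could enter through extra conditioning. The symbols m (inpainting mask), b (masked background render), and d (rendered depth) are illustrative assumptions and not the authors' stated formulation.

```latex
% Standard SDS: x = g(\theta) is a view rendered from the object parameters \theta,
% y is the text prompt, \hat{\epsilon}_\phi is the frozen diffusion denoiser.
\begin{align}
  x_t &= \alpha_t\, x + \sigma_t\, \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I), \\
  \nabla_\theta \mathcal{L}_{\mathrm{SDS}}
    &= \mathbb{E}_{t,\epsilon}\!\left[\, w(t)\,
       \bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
       \frac{\partial x}{\partial \theta} \right].
\end{align}

% Hypothetical depth-guided inpainting variant (assumed conditioning on m, b, d):
\begin{equation}
  \nabla_\theta \mathcal{L}_{\text{DI-SDS}}
    \approx \mathbb{E}_{t,\epsilon}\!\left[\, w(t)\,
       \bigl(\hat{\epsilon}_\phi(x_t;\, y,\, m,\, b,\, d,\, t) - \epsilon\bigr)\,
       \frac{\partial x}{\partial \theta} \right].
\end{equation}
```

Under this reading, the denoiser sees the surrounding background through b, so its predicted noise would favor object appearance that matches the scene lighting, consistent with the key observation stated in the abstract.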

Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting

DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-Aware Scene Synthesis

Learning 3D Scene Synthesis from Annotated RGB-D Images

RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion

Inpaint3D: 3D Scene Content Generation using 2D Inpainting Diffusion

Architect: Generating Vivid and Interactive 3D Scenes with Hierarchical 2D Inpainting

NeRFiller: Completing Scenes via Generative 3D Inpainting

PaintScene4D: Consistent 4D Scene Generation from Text Prompts

From 2D to 3D: Re-thinking Benchmarking of Monocular Depth Prediction

3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation

Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion

ART3D: 3D Gaussian Splatting for Text-Guided Artistic Scenes Generation

Automatic Scene Inference for 3D Object Compositing

Enhanced 3D Generation by 2D Editing

Enhancing Zero-shot 3D Photography Via Mesh-represented Image Inpainting

Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth

3inGAN: Learning a 3D Generative Model from Images of a Self-similar Scene

Localized Gaussian Splatting Editing with Contextual Awareness

Wonderland: Navigating 3D Scenes from a Single Image

InFusion: Inpainting 3D Gaussians via Learning Depth Completion from Diffusion Prior

RGBD2: Generative Scene Synthesis via Incremental View Inpainting using RGBD Diffusion Models