Intrinsic Image Diffusion for Indoor Single-view Material Estimation

Peter Kocsis,Vincent Sitzmann,Matthias Nießner

2024-03-21

Abstract:We present Intrinsic Image Diffusion, a generative model for appearance decomposition of indoor scenes. Given a single input view, we sample multiple possible material explanations represented as albedo, roughness, and metallic maps. Appearance decomposition poses a considerable challenge in computer vision due to the inherent ambiguity between lighting and material properties and the lack of real datasets. To address this issue, we advocate for a probabilistic formulation, where instead of attempting to directly predict the true material properties, we employ a conditional generative model to sample from the solution space. Furthermore, we show that utilizing the strong learned prior of recent diffusion models trained on large-scale real-world images can be adapted to material estimation and highly improves the generalization to real images. Our method produces significantly sharper, more consistent, and more detailed materials, outperforming state-of-the-art methods by $1.5dB$ on PSNR and by $45\%$ better FID score on albedo prediction. We demonstrate the effectiveness of our approach through experiments on both synthetic and real-world datasets.

Computer Vision and Pattern Recognition,Artificial Intelligence,Graphics

What problem does this paper attempt to address?

The paper aims to address the problem of material estimation in single-view scenarios, particularly the appearance decomposition in indoor scenes. The inherent ambiguity between lighting and material properties makes this task very challenging. Specifically, the paper proposes a generative model based on a diffusion model—Intrinsic Image Diffusion—to predict multiple possible material interpretations (such as albedo, roughness, and metallic maps) from a single input image. By leveraging the powerful priors of the latest diffusion models, this method can generate more detailed and consistent material estimates and outperforms existing state-of-the-art methods in complex indoor scenes. The main contributions of the paper include: 1. Formalizing the appearance decomposition problem as a probabilistic problem and using a diffusion model to sample the solution space. 2. Utilizing the priors of a pre-trained diffusion model on real images for material estimation, achieving significant improvements in albedo prediction (FID improved by 77.6%, PSNR improved by 4.04 dB). 3. Optimizing lighting conditions using the material prediction results, supporting the optimization of multiple point light sources and global environment light maps. Through experimental validation, the method not only performs well on synthetic datasets but also shows good performance on real-world datasets, with richer and clearer generated material details. Additionally, the paper explores the impact of pre-trained models on performance improvement.

Intrinsic Image Diffusion for Indoor Single-view Material Estimation

IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination

DiffMat: Latent diffusion models for image-guided material generation

MatFusion: A Generative Diffusion Model for SVBRDF Capture

Colorful Diffuse Intrinsic Image Decomposition in the Wild

RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation

Mixed Diffusion for 3D Indoor Scene Synthesis

MaterialFusion: Enhancing Inverse Rendering with Material Diffusion Priors

Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering

Generating Material-Aware 3D Models from Sparse Views

Coherent 3D Scene Diffusion From a Single RGB Image

Material Anything: Generating Materials for Any 3D Object via Diffusion

RGB$\leftrightarrow$X: Image decomposition and synthesis using material- and lighting-aware diffusion models

Denoising Diffusion via Image-Based Rendering

InsertDiffusion: Identity Preserving Visualization of Objects through a Training-Free Diffusion Architecture

Diffusion-based image inpainting with internal learning

DiffuScene: Denoising Diffusion Models for Generative Indoor Scene Synthesis

Joint Material and Illumination Estimation from Photo Sets in the Wild

RoomDiffusion: A Specialized Diffusion Model in the Interior Design Industry

Retinex-Diffusion: On Controlling Illumination Conditions in Diffusion Models via Retinex Theory