Abstract: Texture, highlights, and shading are some of many visual cues that allow humans to perceive material appearance in single pictures. Yet, recovering spatially-varying bi-directional reflectance distribution functions (SVBRDFs) from a single image based on such cues has challenged researchers in computer graphics for decades. We tackle lightweight appearance capture by training a deep neural network to automatically extract and make sense of these visual cues. Once trained, our network is capable of recovering per-pixel normal, diffuse albedo, specular albedo and specular roughness from a single picture of a flat surface lit by a hand-held flash. We achieve this goal by introducing several innovations on training data acquisition and network design. For training, we leverage a large dataset of artist-created, procedural SVBRDFs which we sample and render under multiple lighting directions. We further amplify the data by material mixing to cover a wide diversity of shading effects, which allows our network to work across many material classes. Motivated by the observation that distant regions of a material sample often offer complementary visual cues, we design a network that combines an encoder-decoder convolutional track for local feature extraction with a fully-connected track for global feature extraction and propagation. Many important material effects are view-dependent, and as such ambiguous when observed in a single image. We tackle this challenge by defining the loss as a differentiable SVBRDF similarity metric that compares the renderings of the predicted maps against renderings of the ground truth from several lighting and viewing directions. Combined, these novel ingredients bring clear improvement over state-of-the-art methods for single-shot capture of spatially varying BRDFs.
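The rendering-based similarity metric described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the shading model, the distance function, or the number of sampled directions, so everything concrete below is an assumption made for illustration: a Lambertian diffuse term plus a GGX specular lobe, an L1 distance on log-renderings, nine random light/view pairs, scalar (grayscale) albedos, and the hypothetical names `render_pixel` and `rendering_loss`. The code is plain Python for readability; in training, the same computation would be written in an automatic-differentiation framework so that gradients reach the predicted maps.

```python
import math
import random

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def render_pixel(px, light, view):
    """Shade one pixel: Lambertian diffuse + GGX specular (assumed model).

    px is a dict with 'normal' (3-tuple), and scalar 'diffuse', 'specular'
    and 'roughness' (roughness must be > 0). light and view are unit vectors.
    """
    n = normalize(px["normal"])
    h = normalize(tuple(l + v for l, v in zip(light, view)))  # half vector
    n_l = max(dot(n, light), 0.0)
    n_v = max(dot(n, view), 1e-4)
    n_h = max(dot(n, h), 0.0)
    v_h = max(dot(view, h), 0.0)

    alpha = px["roughness"] ** 2
    a2 = alpha * alpha
    # GGX normal distribution term
    d = a2 / (math.pi * (n_h * n_h * (a2 - 1.0) + 1.0) ** 2)
    # Schlick approximation of the Fresnel term
    f = px["specular"] + (1.0 - px["specular"]) * (1.0 - v_h) ** 5
    # Schlick-style geometric shadowing/masking term
    k = alpha / 2.0
    g = (n_l / (n_l * (1.0 - k) + k)) * (n_v / (n_v * (1.0 - k) + k))

    diffuse = px["diffuse"] / math.pi * n_l
    specular = d * f * g / (4.0 * n_v)  # cosine factor n_l already cancelled
    return diffuse + specular

def random_direction(rng):
    """Random unit vector in the upper hemisphere (z >= 0.1)."""
    z = rng.uniform(0.1, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)

def rendering_loss(pred, truth, n_views=9, seed=0, eps=0.01):
    """Mean L1 distance between log-renderings of two SVBRDF pixel lists,
    averaged over n_views random light/view pairs shared by both renders."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(n_views):
        light, view = random_direction(rng), random_direction(rng)
        for p, t in zip(pred, truth):
            rp = render_pixel(p, light, view)
            rt = render_pixel(t, light, view)
            total += abs(math.log(rp + eps) - math.log(rt + eps))
            count += 1
    return total / count

# Usage: a flat 2x2 patch compared with a rougher version of itself.
flat = [{"normal": (0.0, 0.0, 1.0), "diffuse": 0.5,
         "specular": 0.04, "roughness": 0.3} for _ in range(4)]
rough = [dict(p, roughness=0.8) for p in flat]
print(rendering_loss(flat, flat), rendering_loss(flat, rough))
```

Comparing renderings rather than the raw maps means that two parameter sets are penalized only to the extent that they produce different appearance under the sampled directions, which is the motivation the abstract gives for handling view-dependent effects.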

Deep inverse rendering for high-resolution SVBRDF estimation from an arbitrary number of images

Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF from a Single Image

Deep Inverse Rendering for Practical Object Appearance Scan with Uncalibrated Illumination

Ultra-High Resolution SVBRDF Recovery from a Single Image

MAIR: Multi-view Attention Inverse Rendering with 3D Spatially-Varying Lighting Estimation

Single-image SVBRDF capture with a rendering-aware deep network

Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images

Photometric Inverse Rendering: Shading Cues Modeling and Surface Reflectance Regularization

Flexible SVBRDF Capture with a Multi‐Image Deep Network

Invertible Neural BRDF for Object Inverse Rendering

Differentiable Inverse Rendering with Interpretable Basis BRDFs

DeepBRDF: A Deep Representation for Manipulating Measured BRDF

SIRe-IR: Inverse Rendering for BRDF Reconstruction with Shadow and Illumination Removal in High-Illuminance Scenes

Face Inverse Rendering via Hierarchical Decoupling

Neural Inverse Rendering of an Indoor Scene from a Single Image

PhyIR: Physics-based Inverse Rendering for Panoramic Indoor Images

IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination

Multi-view Inverse Rendering for Large-scale Real-world Indoor Scenes

MAIR++: Improving Multi-view Attention Inverse Rendering with Implicit Lighting Representation

Revisiting Deep Intrinsic Image Decompositions

Revisiting Deep Image Smoothing and Intrinsic Image Decomposition