Abstract: We propose TensoIR, a novel inverse rendering approach based on tensor factorization and neural fields. Unlike previous works that use purely MLP-based neural fields, thus suffering from low capacity and high computation costs, we extend TensoRF, a state-of-the-art approach for radiance field modeling, to estimate scene geometry, surface reflectance, and environment illumination from multi-view images captured under unknown lighting conditions. Our approach jointly achieves radiance field reconstruction and physically-based model estimation, leading to photo-realistic novel view synthesis and relighting results. Benefiting from the efficiency and extensibility of the TensoRF-based representation, our method can accurately model secondary shading effects (like shadows and indirect lighting) and generally support input images captured under single or multiple unknown lighting conditions. The low-rank tensor representation allows us to not only achieve fast and compact reconstruction but also better exploit shared information under an arbitrary number of capturing lighting conditions. We demonstrate the superiority of our method to baseline methods qualitatively and quantitatively on various challenging synthetic and real-world scenes.

What problem does this paper attempt to address?

This paper addresses inverse rendering, a long-standing problem in computer vision and graphics. The authors propose TensoIR, an inverse rendering method based on tensor factorization and neural fields that estimates scene geometry, surface reflectance, and environment illumination from multi-view images captured under unknown lighting conditions.

### Main problems

1. **Challenges of inverse rendering**:
   - Inverse rendering is a long-standing, difficult problem, especially when the input images are captured in the wild under unknown illumination. The problem is inherently ill-posed: multiple solutions can explain the same images.
   - Existing purely MLP-based methods (such as NeRF) have limited capacity and high computational cost, which restricts the accuracy and efficiency of inverse rendering.
2. **Limitations of existing methods**:
   - Purely MLP-based methods are too computationally expensive to handle secondary shading effects (such as shadows and indirect illumination), so many methods either ignore these effects or approximate them with additional MLP networks, which requires costly precomputation and reduces accuracy.
   - Although capture under multiple lighting conditions benefits inverse rendering, existing methods become extremely expensive when handling multiple unknown illumination conditions.

### Solutions

The authors propose TensoIR, a framework based on tensor factorization. Its main features are:

1. **Efficient and accurate inverse rendering**:
   - Built on TensoRF (an efficient tensor-factorized representation), the method simultaneously estimates scene geometry, material properties, and illumination.
   - The factorized representation allows fast, compact scene reconstruction and better use of the information shared across any number of capture lighting conditions.
2. **Online calculation of secondary shading effects**:
   - With the low-rank tensor representation, TensoIR can compute ray integrals online during training, which yields accurate visibility and indirect illumination.
   - This lets the model account more accurately for secondary shading effects, such as shadows and indirect illumination, during reconstruction.
3. **Support for multiple lighting conditions**:
   - By adding an illumination dimension to the tensor-factorized representation, TensoIR handles multi-light capture efficiently and uses the additional photometric cues to reduce ambiguity in material estimation, which improves reconstruction quality.
4. **Joint optimization framework**:
   - TensoIR estimates all scene components (geometry, material, illumination) jointly, supervised by a rendering loss and regularization terms.

### Summary

The main contribution of TensoIR is an efficient and accurate inverse rendering solution that handles multi-view images captured under unknown illumination and achieves high-quality geometry and material reconstruction in complex real-world scenes, along with realistic novel view synthesis and relighting.
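
The online visibility computation mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the analytic `density` function is a hypothetical stand-in for the learned tensor-factorized density field, and the function names, sample count, and toy scene are assumptions chosen for illustration. The sketch marches a secondary ray from a surface point toward a light direction and approximates the transmittance, i.e. the exponential of the negative integral of density along the ray.

```python
import math


def density(p):
    """Hypothetical density field: a solid sphere of radius 0.5 at the origin.

    In a real system this value would be queried from the learned field.
    """
    x, y, z = p
    return 50.0 if x * x + y * y + z * z <= 0.25 else 0.0


def transmittance(origin, direction, t_near=0.0, t_far=4.0, n_samples=128):
    """Approximate exp(-integral of density along the ray) with a midpoint sum."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    dt = (t_far - t_near) / n_samples
    optical_depth = 0.0
    for i in range(n_samples):
        t = t_near + (i + 0.5) * dt  # midpoint of the i-th ray segment
        point = [o + t * u for o, u in zip(origin, unit)]
        optical_depth += density(point) * dt
    return math.exp(-optical_depth)


# A surface point to the left of the sphere. One light direction passes
# through the sphere (shadowed), the other misses it (fully visible).
surface_point = (-2.0, 0.0, 0.0)
blocked = transmittance(surface_point, (1.0, 0.0, 0.0))
clear = transmittance(surface_point, (0.0, 1.0, 0.0))
```

Here `blocked` is close to zero and `clear` equals one, so the first direction would contribute almost no direct light to the shading of `surface_point`. The cost of such a query is dominated by the density lookups, which is why a representation that is cheap to evaluate makes it practical to run these integrals during training.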

TensoIR: Tensorial Inverse Rendering

TensoRF: Tensorial Radiance Fields

Joint Optimization of Triangle Mesh, Material, and Light from Neural Fields with Neural Radiance Cache

IRCasTRF: Inverse Rendering by Optimizing Cascaded Tensorial Radiance Fields, Lighting, and Materials From Multi-view Images

IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor Scenes

SIRe-IR: Inverse Rendering for BRDF Reconstruction with Shadow and Illumination Removal in High-Illuminance Scenes

Multi-view Inverse Rendering for Large-scale Real-world Indoor Scenes

IRIS: Inverse Rendering of Indoor Scenes from Low Dynamic Range Images

Neural Fields meet Explicit Geometric Representation for Inverse Rendering of Urban Scenes

TensoSDF: Roughness-aware Tensorial Representation for Robust Geometry and Material Reconstruction

VMINer: Versatile Multi-view Inverse Rendering with Near- and Far-field Light Sources

PhyIR: Physics-based Inverse Rendering for Panoramic Indoor Images

Physics-based Indirect Illumination for Inverse Rendering

MAIR: Multi-view Attention Inverse Rendering with 3D Spatially-Varying Lighting Estimation

Neural Fields Meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes

Modeling Indirect Illumination for Inverse Rendering

NeISF: Neural Incident Stokes Field for Geometry and Material Estimation

Inverse Rendering of Translucent Objects using Physical and Neural Renderers

ReN Human: Learning Relightable Neural Implicit Surfaces for Animatable Human Rendering

NePF: Neural Photon Field for Single-Stage Inverse Rendering