Haian Jin, Isabella Liu, Peijia Xu, Xiaoshuai Zhang, Songfang Han, Sai Bi, Xiaowei Zhou, Zexiang Xu, Hao Su
Abstract: We propose TensoIR, a novel inverse rendering approach based on tensor factorization and neural fields. Unlike previous works that use purely MLP-based neural fields, thus suffering from low capacity and high computation costs, we extend TensoRF, a state-of-the-art approach for radiance field modeling, to estimate scene geometry, surface reflectance, and environment illumination from multi-view images captured under unknown lighting conditions. Our approach jointly achieves radiance field reconstruction and physically-based model estimation, leading to photo-realistic novel view synthesis and relighting results. Benefiting from the efficiency and extensibility of the TensoRF-based representation, our method can accurately model secondary shading effects (like shadows and indirect lighting) and generally support input images captured under single or multiple unknown lighting conditions. The low-rank tensor representation allows us to not only achieve fast and compact reconstruction but also better exploit shared information under an arbitrary number of capturing lighting conditions. We demonstrate the superiority of our method to baseline methods qualitatively and quantitatively on various challenging synthetic and real-world scenes.
What problem does this paper attempt to address?
This paper addresses the inverse rendering problem in computer vision and graphics. Specifically, the authors propose TensoIR, a new inverse rendering method based on tensor factorization and neural fields, which estimates scene geometry, surface reflectance, and environment illumination from multi-view images captured under unknown lighting conditions.
### Main problems
1. **Challenges of inverse rendering**:
- Inverse rendering is a long-standing and difficult problem, especially when the input images are captured in the wild under unknown illumination. The problem is inherently ill-posed: many different combinations of geometry, material, and lighting can explain the same set of images.
- Existing methods built on purely MLP-based neural fields (such as NeRF) have limited capacity and high computational cost, which limits both the accuracy and the efficiency of inverse rendering.
2. **Limitations of existing methods**:
- Purely MLP-based methods are too computationally expensive to handle secondary shading effects (such as shadows and indirect illumination) directly. Many methods therefore either ignore these effects or approximate them with additional MLP networks, which requires expensive precomputation and reduces accuracy.
- Although capturing a scene under multiple lighting conditions benefits inverse rendering, existing methods become extremely expensive when they must handle multiple unknown lighting conditions.
### Solutions
The authors propose TensoIR, a new framework based on tensor factorization. Its main features are:
1. **Efficient and accurate inverse rendering**:
- Building on TensoRF (an efficient tensor-factorized radiance field representation), the method jointly estimates scene geometry, material properties, and illumination.
- The low-rank tensor representation allows fast and compact reconstruction and better exploits the information shared across an arbitrary number of captured lighting conditions.
2. **Online computation of secondary shading effects**:
- Because the low-rank tensor representation is cheap to query, TensoIR can compute ray integrals online during training, which gives accurate visibility and indirect illumination.
- As a result, the model accounts for secondary shading effects, such as shadows and indirect illumination, more accurately during reconstruction.
3. **Support for multiple lighting conditions**:
- By adding an extra illumination dimension to the factorized tensor representation, TensoIR handles captures taken under multiple lighting conditions efficiently. The additional photometric cues reduce the ambiguity in material estimation and thus improve reconstruction quality.
4. **Joint optimization framework**:
- TensoIR estimates all scene components (geometry, material, and illumination) simultaneously in a single joint optimization, supervised by rendering losses and regularization terms.
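
As a rough illustration of features 1 to 3, the sketch below shows a TensoRF-style vector-matrix lookup in plain Python, an optional per-lighting weight on each rank component, and a transmittance computation along a secondary ray. Everything in it is an assumption made for illustration: the names `vm_lookup` and `visibility`, the dictionary keys, the nearest-voxel indexing, and the exact form of the lighting weight are hypothetical and are not taken from the TensoIR code, which is considerably more elaborate.

```python
import math


def vm_lookup(components, i, j, k, light=None):
    """Evaluate a vector-matrix factorized grid at integer voxel (i, j, k).

    Each rank component contributes
        vx[i] * myz[j][k] + vy[j] * mxz[i][k] + vz[k] * mxy[i][j].
    If `light` is given, the component is scaled by a per-lighting weight e[light]
    (an assumed, simplified form of the extra illumination dimension).
    """
    total = 0.0
    for comp in components:
        value = (
            comp["vx"][i] * comp["myz"][j][k]
            + comp["vy"][j] * comp["mxz"][i][k]
            + comp["vz"][k] * comp["mxy"][i][j]
        )
        if light is not None:
            value *= comp["e"][light]
        total += value
    return total


def visibility(density_components, voxels, step):
    """Transmittance exp(-sum(sigma * step)) along a secondary ray.

    `voxels` lists the (i, j, k) cells the ray passes through; negative
    densities are clamped to zero before integration.
    """
    optical_depth = 0.0
    for (i, j, k) in voxels:
        sigma = max(vm_lookup(density_components, i, j, k), 0.0)
        optical_depth += sigma * step
    return math.exp(-optical_depth)


# Toy rank-1 grid with resolution 2 along each axis (values are arbitrary).
toy = [{
    "vx": [1.0, 2.0], "vy": [0.5, 0.0], "vz": [0.0, 1.0],
    "myz": [[1.0, 0.0], [0.0, 1.0]],
    "mxz": [[2.0, 0.0], [0.0, 2.0]],
    "mxy": [[0.0, 1.0], [1.0, 0.0]],
    "e": [1.0, 0.5],  # one weight per lighting condition
}]

sigma = vm_lookup(toy, 0, 0, 0)             # density-style query
feat_l1 = vm_lookup(toy, 0, 0, 0, light=1)  # same voxel under lighting condition 1
vis = visibility(toy, [(0, 0, 0), (1, 1, 1)], step=0.5)
```

The point of the sketch is the cost profile: a query is a handful of multiply-adds per rank component rather than a full MLP evaluation, which is what makes marching secondary rays during training plausible, and the lighting weight reuses the same spatial factors across all lighting conditions.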
### Summary
The main contribution of TensoIR is an efficient and accurate inverse rendering method that handles multi-view images captured under unknown illumination. It achieves high-quality geometry and material reconstruction on challenging synthetic and real-world scenes, along with photo-realistic novel view synthesis and relighting.