StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal

Chongjie Ye, Lingteng Qiu, Xiaodong Gu, Qi Zuo, Yushuang Wu, Zilong Dong, Liefeng Bo, Yuliang Xiu, Xiaoguang Han

2024-06-25

Abstract: This work addresses the challenge of high-quality surface normal estimation from monocular colored inputs (i.e., images and videos), a field which has recently been revolutionized by repurposing diffusion priors. However, previous attempts still struggle with stochastic inference, which conflicts with the deterministic nature of the Image2Normal task, and with a costly ensembling step, which slows down the estimation process. Our method, StableNormal, mitigates the stochasticity of the diffusion process by reducing inference variance, thus producing "Stable-and-Sharp" normal estimates without any additional ensembling process. StableNormal works robustly under challenging imaging conditions, such as extreme lighting, blurring, and low quality. It is also robust against transparent and reflective surfaces, as well as cluttered scenes with numerous objects. Specifically, StableNormal employs a coarse-to-fine strategy: it starts with a one-step normal estimator (YOSO) to derive an initial normal guess that is relatively coarse but reliable, followed by a semantic-guided refinement process (SG-DRN) that refines the normals to recover geometric details. The effectiveness of StableNormal is demonstrated through competitive performance on standard datasets such as DIODE-indoor, iBims, ScanNetV2 and NYUv2, and also in various downstream tasks, such as surface reconstruction and normal enhancement. These results show that StableNormal retains both the "stability" and "sharpness" needed for accurate normal estimation. StableNormal represents a baby attempt to repurpose diffusion priors for deterministic estimation. To democratize this, code and models have been made publicly available at http://hf.co/Stable-X

Computer Vision and Pattern Recognition, Artificial Intelligence, Graphics

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper addresses high-quality surface normal estimation from monocular color inputs (i.e., images and videos). The field has recently advanced rapidly by repurposing diffusion priors, but existing methods still face the following problems:

1. **Stochastic Inference**: Diffusion-based methods are inherently random at inference time, which conflicts with the deterministic nature of the image-to-normal task.
2. **High Cost of Ensembling**: To reduce this randomness, existing methods often ensemble multiple predictions, which significantly slows down estimation.
3. **Lack of Robustness**: Existing methods are unstable under challenging imaging conditions such as extreme lighting, blur, and low image quality.
4. **Handling Transparent and Reflective Surfaces**: Existing methods perform poorly on transparent and reflective surfaces.
5. **Handling Complex Scenes**: Existing methods struggle in cluttered scenes containing numerous objects.

To address these issues, the paper proposes **StableNormal**, which improves the stability and sharpness of normal estimation by reducing inference variance in the diffusion process, without any additional ensembling. Specifically, StableNormal adopts a coarse-to-fine strategy: a one-step normal estimator (YOSO) first produces an initial estimate that is relatively coarse but reliable, and a semantic-guided refinement process (SG-DRN) then refines the normals to recover geometric details. The paper reports competitive performance on DIODE-indoor, iBims, ScanNetV2, and NYUv2, as well as in downstream tasks such as surface reconstruction and normal enhancement.
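To make the stochastic-inference and ensembling issues concrete, the following is a minimal sketch of how several stochastic normal predictions for one pixel might be ensembled and how their spread could be measured. It is not StableNormal's code and uses no model: `noisy_predictions` is a hypothetical stand-in that simulates a stochastic estimator with Gaussian noise, and all function names are assumptions made for illustration.

```python
import math
import random


def normalize(v):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return tuple(c / norm for c in v)


def ensemble_normal(samples):
    """Ensemble unit normals by summing them and renormalizing."""
    summed = [sum(s[i] for s in samples) for i in range(3)]
    return normalize(summed)


def angular_spread_deg(samples):
    """Mean angle (degrees) between each sample and the ensemble normal."""
    mean = ensemble_normal(samples)
    angles = []
    for s in samples:
        dot = sum(a * b for a, b in zip(normalize(s), mean))
        dot = max(-1.0, min(1.0, dot))  # guard acos against rounding error
        angles.append(math.degrees(math.acos(dot)))
    return sum(angles) / len(angles)


def noisy_predictions(true_normal, noise, k, seed=0):
    """Hypothetical stand-in for k runs of a stochastic estimator."""
    rng = random.Random(seed)
    return [
        normalize(tuple(c + rng.gauss(0.0, noise) for c in true_normal))
        for _ in range(k)
    ]


if __name__ == "__main__":
    truth = (0.0, 0.0, 1.0)
    for noise in (0.3, 0.01):
        preds = noisy_predictions(truth, noise, k=10)
        print(noise, ensemble_normal(preds), angular_spread_deg(preds))
```

In this toy setting, an estimator with a large spread needs many samples (and therefore many forward passes) before the ensemble settles, while one with a small spread is already usable from a single pass. That trade-off is the motivation the summary gives for reducing inference variance instead of ensembling.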

Normal Estimation Via Shifted Neighborhood for Point Cloud

Geometry Guided Deep Surface Normal Estimation

D2NT: A High-Performing Depth-to-Normal Translator

Surface Normals in the Wild

Three-Filters-to-Normal+: Revisiting Discontinuity Discrimination in Depth-to-Normal Translation

Dense Photometric Stereo: A Markov Random Field Approach

Robust stereo matching with surface normal prediction.

Three-Filters-to-Normal: An Accurate and Ultrafast Surface Normal Estimator

SS-Norm: Spectral-spatial Normalization for Single-Domain Generalization with Application to Retinal Vessel Segmentation.

A Novel Method of Normal Estimation for Visualization of Medical Images

Refine-Net: Normal Refinement Neural Network for Noisy Point Clouds

NeuralGF: Unsupervised Point Normal Estimation by Learning Neural Gradient Function

On accurate recovery of 3D surface normal using minimum 2D images

Stealing Stable Diffusion Prior for Robust Monocular Depth Estimation

InfoNorm: Mutual Information Shaping of Normals for Sparse-View Reconstruction

Estimating High-resolution Surface Normals via Low-resolution Photometric Stereo Images

Normal-GS: 3D Gaussian Splatting with Normal-Involved Rendering

Adaptive Surface Normal Constraint for Geometric Estimation From Monocular Images

OCMG-Net: Neural Oriented Normal Refinement for Unstructured Point Clouds