Abstract: Owing to the rapid development of $360^{\circ }$ panoramic imaging techniques, indoor $360^{\circ }$ depth estimation has attracted extensive attention in the community. Because ground-truth depth data are scarce, modeling indoor $360^{\circ }$ depth estimation in a self-supervised manner is an urgent need. However, self-supervised $360^{\circ }$ depth estimation suffers from two major limitations. One is the distortion, and the resulting network training problems, caused by equirectangular projection (ERP); the other is that texture-less regions provide little signal for back-propagation in self-supervised mode. To address these issues, we introduce spherical view synthesis for learning self-supervised $360^{\circ }$ depth estimation. Specifically, to alleviate the ERP-related problems, we first propose a dual-branch distortion-aware network, comprising a distortion-aware module and a hybrid projection fusion module, to produce a coarse depth map. The coarse depth map is then used for spherical view synthesis, in which a spherically weighted loss function for view reconstruction and depth smoothing is investigated to account for the projection distribution of $360^{\circ }$ images. In addition, two structural regularities of indoor $360^{\circ }$ scenes, the principal-direction normal constraint and the co-planar depth constraint, are devised as additional supervisory signals to efficiently optimize our self-supervised $360^{\circ }$ depth estimation model. The principal-direction normal constraint aligns the surface normals estimated from the $360^{\circ }$ image with the directions of the vanishing points, while the co-planar depth constraint fits the estimated depth of each pixel to its 3D plane. Finally, a depth map is obtained for the $360^{\circ }$ image. Experimental results show that the proposed method outperforms current advanced depth estimation methods on four publicly available datasets.
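
The abstract does not give the form of the spherically weighted loss. As a rough illustration only, the sketch below assumes a common choice for ERP images: each row is weighted by the cosine of its latitude, so that the over-sampled polar rows contribute less to an L1 view-reconstruction error. The function names `erp_latitude_weights` and `spherically_weighted_l1`, and the weighting itself, are assumptions and not the authors' formulation.

```python
import math


def erp_latitude_weights(height):
    """Per-row weights cos(latitude) for an ERP image with `height` rows.

    Assumption: rows are sampled uniformly in latitude, top row near +pi/2.
    """
    weights = []
    for row in range(height):
        # Latitude of the row centre, running from +pi/2 (top) to -pi/2 (bottom).
        latitude = math.pi / 2 - (row + 0.5) * math.pi / height
        weights.append(math.cos(latitude))
    return weights


def spherically_weighted_l1(target, synthesized):
    """Latitude-weighted mean absolute error between two single-channel images.

    Both images are lists of rows (lists of floats) with identical shapes.
    """
    weights = erp_latitude_weights(len(target))
    weighted_error = 0.0
    weight_sum = 0.0
    for w, row_t, row_s in zip(weights, target, synthesized):
        for t, s in zip(row_t, row_s):
            weighted_error += w * abs(t - s)
            weight_sum += w
    return weighted_error / weight_sum


if __name__ == "__main__":
    target = [[0.0, 0.0] for _ in range(4)]
    # The same unit error placed near the pole and near the equator.
    error_at_pole = [[1.0, 1.0]] + [[0.0, 0.0] for _ in range(3)]
    error_at_equator = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
    print(spherically_weighted_l1(target, error_at_pole))
    print(spherically_weighted_l1(target, error_at_equator))
```

Under this assumed weighting, an identical per-pixel error is penalized less near the poles than near the equator, which is one way to compensate for the uneven sampling density of ERP.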

Distortion-Aware Self-Supervised Indoor 360$^{\circ }$ Depth Estimation Via Hybrid Projection Fusion and Structural Regularities

Self-supervised Indoor 360-Degree Depth Estimation via Structural Regularization

Distortion-Aware Monocular Depth Estimation for Omnidirectional Images

OLANET: Self-Supervised 360° Depth Estimation with Effective Distortion-Aware View Synthesis and L1 Smooth Regularization

Self-Supervised Dense Depth Estimation with Panoramic Image and Sparse Lidar

Robust and Flexible Omnidirectional Depth Estimation with Multiple 360° Cameras

CRF360D: Monocular 360 Depth Estimation via Spherical Fully-Connected CRFs

Depth Anywhere: Enhancing 360 Monocular Depth Estimation via Perspective Distillation and Unlabeled Data Augmentation

360Recon: An Accurate Reconstruction Method Based on Depth Fusion from 360 Images

High-Quality Depth Recovery Via Interactive Multi-view Stereo

StructDepth: Leveraging the Structural Regularities for Self-Supervised Indoor Depth Estimation

Revisiting 360 Depth Estimation with PanoGabor: A New Fusion Perspective

Hybrid-MVS: Robust Multi-View Reconstruction with Hybrid Optimization of Visual and Depth Cues

Depth360: Self-supervised Learning for Monocular Depth Estimation using Learnable Camera Distortion Model

Distortion-Tolerant Monocular Depth Estimation On Omnidirectional Images Using Dual-cubemap

OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware Fusion

Using Full-Scale Feature Fusion for Self-Supervised Indoor Depth Estimation

Self-supervised 360$^{\circ}$ Room Layout Estimation

Complementary Bi-directional Feature Compression for Indoor 360° Semantic Segmentation with Self-distillation

SGFormer: Spherical Geometry Transformer for 360 Depth Estimation

Self-Supervised Monocular Depth Estimation with Self-Reference Distillation and Disparity Offset Refinement