Abstract: Depth completion is a crucial task in autonomous driving that converts a sparse depth map into a dense depth prediction. The sparse depth map serves as a partial reference for the actual depth, and RGB images are frequently fused with it to aid completion because of their rich semantic information. Image-guided depth completion faces three principal challenges: (1) fusing the two modalities effectively; (2) improving the recovery of depth information; and (3) achieving the real-time prediction required in practical autonomous driving scenarios. In response to these challenges, we propose a concise but high-performing network, named CHNet, with a simple and straightforward architecture. First, we use a fast guidance module to fuse the two sensor features, exploiting the abundant auxiliary information in the color space. Unlike the prevalent complex guidance modules, our approach adopts an intuitive and low-cost strategy. In addition, we identify and analyze an optimization inconsistency problem between observed and unobserved positions. To mitigate this problem, we introduce a decoupled depth prediction head, tailored to better distinguish and predict depth values for both valid (observed) and invalid (unobserved) positions while adding minimal inference time. Built on a dual-encoder, single-decoder architecture, CHNet's simplicity yields a favorable balance between accuracy and computational efficiency. In benchmark evaluations on the KITTI depth completion dataset, CHNet achieves accuracy and inference speed competitive with contemporary state-of-the-art methods. To assess the generalizability of our approach, we extend our evaluation to the indoor NYUv2 dataset, where CHNet continues to perform well. The code will be available at https://github.com/lmomoy/CHNet.
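
As a rough illustration of what decoupling predictions for observed and unobserved positions can look like, the sketch below merges two per-pixel depth predictions using the validity mask of the sparse input. This is not CHNet's actual head: the function name `merge_decoupled_predictions`, the two prediction arrays, and the convention that a pixel is observed when its sparse depth is greater than zero are all assumptions made for illustration.

```python
import numpy as np


def merge_decoupled_predictions(sparse_depth, pred_observed, pred_unobserved):
    """Combine two depth predictions using the sparse-depth validity mask.

    Hypothetical helper, not taken from the CHNet code base.
    Assumption: a pixel is "observed" when its sparse depth is > 0.
    """
    valid = sparse_depth > 0
    # Observed pixels take the prediction specialised for valid positions;
    # all other pixels take the prediction specialised for invalid positions.
    return np.where(valid, pred_observed, pred_unobserved)


# Toy 2x2 example: only the top-right pixel has a LiDAR measurement.
sparse = np.array([[0.0, 5.0], [0.0, 0.0]])
pred_obs = np.array([[1.0, 5.1], [1.0, 1.0]])
pred_unobs = np.array([[7.0, 9.0], [8.0, 6.0]])
dense = merge_decoupled_predictions(sparse, pred_obs, pred_unobs)
```

In a real network the two predictions would come from separate output branches on top of the shared decoder, and each branch could be supervised only at its own positions, which is one way to avoid the two position types competing during optimization.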

Self-Supervised Depth Completion From Direct Visual-LiDAR Odometry in Autonomous Driving

Least Square Estimation Network for Depth Completion

A Depth Estimation Framework Based on Unsupervised Learning and Cross-Modal Translation

Self-supervised Visual-LiDAR Odometry with Flip Consistency

Self-supervised Sparse-to-Dense: Self-supervised Depth Completion from LiDAR and Monocular Camera

Unsupervised Depth Completion From Visual Inertial Odometry

DenseLiDAR: A Real-Time Pseudo Dense Depth Guided Depth Completion Network

Depth Estimation of Traffic Scenes from Image Sequence Using Deep Learning

Learning Guided Convolutional Network for Depth Completion

Depth Completion from Sparse LiDAR Data with Depth-Normal Constraints

A concise but high-performing network for image guided depth completion in autonomous driving

Depth Completion via Inductive Fusion of Planar LIDAR and Monocular Camera

Real-time depth completion based on LiDAR-stereo for autonomous driving

Unsupervised Monocular Estimation of Depth and Visual Odometry Using Attention and Depth-Pose Consistency Loss

Self-supervised Depth Estimation Leveraging Global Perception and Geometric Smoothness Using On-board Videos

FSNet: Redesign Self-Supervised MonoDepth for Full-Scale Depth Prediction for Autonomous Driving

Collaborative Learning of Depth Estimation, Visual Odometry and Camera Relocalization from Monocular Videos

Self-supervised 3D Object Detection from Monocular Pseudo-LiDAR

Self-Supervised Depth Completion Guided by 3D Perception and Geometry Consistency

Self-Supervised Learning of Depth and Ego-motion for 3D Perception in Human Computer Interaction