Abstract: Light detection and ranging (LiDAR) and stereo cameras are two widely used solutions for perceiving 3D information. The complementary properties of these two sensor modalities motivate fusing them to obtain practical depth sensing for real‐world applications. Driven by deep neural network (DNN) techniques, recent works achieve superior accuracy. However, their complex architectures and the sheer number of DNN parameters often lead to poor generalization and prevent real‐time computation. In this paper, we present FastFusion, a three‐stage stereo‐LiDAR deep fusion scheme that integrates LiDAR priors into each step of the classical stereo‐matching taxonomy, achieving high‐precision dense depth sensing in real time. We integrate stereo‐LiDAR information using a compact binary neural network and use the proposed cross‐based LiDAR trust aggregation to further fuse the sparse LiDAR measurements in the back end of stereo matching. To align the estimated depth with the photometric content of the input image, we introduce a refinement network to ensure their consistency. More importantly, we present a graphics processing unit (GPU)‐based acceleration framework that provides a low‐latency implementation of FastFusion, gaining both accuracy improvement and real‐time responsiveness. In the experiments, we demonstrate the effectiveness and practicability of FastFusion, which obtains a significant speedup over state‐of‐the‐art baselines while achieving comparable accuracy in depth sensing. A video demo of real‐time depth estimation with FastFusion in a real‐world driving scenario is available at https://youtu.be/nP7cls2BA8s.
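
To make the idea of injecting LiDAR priors into a stereo matching cost concrete, the following is a minimal sketch on a single scanline. It is not the FastFusion pipeline: it swaps in a plain absolute-difference cost and a linear penalty around the LiDAR disparity in place of the binary neural network and the cross‐based LiDAR trust aggregation described above. All function names, the `weight` parameter, and the toy data are hypothetical and chosen only for illustration.

```python
def depth_to_disparity(depth_m, focal_px, baseline_m):
    """Convert metric depth to disparity for a rectified stereo pair.

    Uses the standard relation d = f * B / Z.
    """
    return focal_px * baseline_m / depth_m


def sad_cost_row(left, right, max_disp):
    """Absolute-difference matching cost for one scanline.

    cost[x][d] compares left[x] with right[x - d]; candidates that fall
    outside the image get an infinite cost so they are never selected.
    """
    inf = float("inf")
    return [
        [abs(left[x] - right[x - d]) if x - d >= 0 else inf
         for d in range(max_disp + 1)]
        for x in range(len(left))
    ]


def apply_lidar_prior(cost, lidar_disp, weight=10.0):
    """Penalize disparities that disagree with sparse LiDAR measurements.

    lidar_disp[x] is None where no LiDAR return projects onto pixel x.
    The linear penalty and its weight are illustrative assumptions.
    """
    fused = [row[:] for row in cost]
    for x, prior in enumerate(lidar_disp):
        if prior is None:
            continue
        for d in range(len(fused[x])):
            fused[x][d] += weight * abs(d - prior)
    return fused


def winner_take_all(cost):
    """Pick the lowest-cost disparity per pixel (first one on ties)."""
    return [min(range(len(row)), key=row.__getitem__) for row in cost]


# Toy example: a textureless scanline, where stereo alone is ambiguous.
left_row = [5, 5, 5, 5, 5, 5]
right_row = [5, 5, 5, 5, 5, 5]
cost = sad_cost_row(left_row, right_row, max_disp=3)

# One hypothetical LiDAR return at pixel 4 with disparity 2.
lidar = [None, None, None, None, 2, None]
fused = apply_lidar_prior(cost, lidar)

stereo_only = winner_take_all(cost)
with_lidar = winner_take_all(fused)
```

On the textureless row every valid disparity has the same stereo cost, so stereo alone falls back to the first candidate, while the LiDAR penalty resolves the ambiguity at the pixel that has a measurement. A real system would also propagate that evidence to neighboring pixels, which is the role the abstract assigns to the aggregation and refinement stages.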

3D LiDAR and Stereo Fusion using Stereo Matching Network with Conditional Cost Volume Normalization

Sparse LIDAR Measurement Fusion with Joint Updating Cost for Fast Stereo Matching

Expanding Sparse LiDAR Depth and Guiding Stereo Matching for Robust Dense Depth Estimation

K-nearest Neighborhood Based Integration of Time-of-flight Cameras and Passive Stereo for High-Accuracy Depth Maps

Real-time depth completion based on LiDAR-stereo for autonomous driving

Volumetric Propagation Network: Stereo-LiDAR Fusion for Long-Range Depth Estimation

LSMD-Net: LiDAR-Stereo Fusion with Mixture Density Network for Depth Sensing

SLFNet: A Stereo and LiDAR Fusion Network for Depth Completion

Multi-Dimensional Cooperative Network for Stereo Matching

CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching

FastFusion: Deep stereo‐LiDAR fusion for real‐time high‐precision dense depth sensing

Holistic and Contextual Evidential Stereo-LiDAR Fusion for Depth Estimation

Adaptive Unimodal Cost Volume Filtering for Deep Stereo Matching

FCNet: Stereo 3D Object Detection with Feature Correlation Networks

Noise-Aware Unsupervised Deep Lidar-Stereo Fusion

Stereo Matching Using Multi-Level Cost Volume and Multi-Scale Feature Constancy

Sparse LiDAR and Stereo Fusion (SLS-Fusion) for Depth Estimation and 3D Object Detection

DLFusion: Painting-Depth Augmenting-LiDAR for Multimodal Fusion 3D Object Detection

CMDFusion: Bidirectional Fusion Network with Cross-modality Knowledge Distillation for LIDAR Semantic Segmentation

Cascaded Feature Interaction Network for Stereo Matching

Multi-Scale Cost Volumes Cascade Network for Stereo Matching