Abstract: Combining range sensors with color cameras is useful for robot navigation, semantic perception, manipulation, and telepresence. Several methods of combining range and color data have been investigated and successfully used in various robotic applications. Most of these systems suffer from noise in the range data and from a resolution mismatch between the range sensor and the color cameras, since the resolution of current range sensors is much lower than that of color cameras. High-resolution depth maps can be obtained using stereo matching, but this often fails to produce accurate depth maps for weakly or repetitively textured scenes, or for scenes that exhibit complex self-occlusions. Range sensors, in contrast, provide coarse depth information regardless of the presence or absence of texture. A calibrated system composed of a time-of-flight (TOF) camera and a stereoscopic camera pair allows data fusion, thus overcoming the weaknesses of each individual sensor. We propose a novel TOF-stereo fusion method based on an efficient seed-growing algorithm that uses the TOF data projected onto the stereo image pair as an initial set of correspondences. These initial "seeds" are then propagated based on a Bayesian model that combines an image similarity score with rough depth priors computed from the low-resolution range data. The overall result is a dense and accurate depth map at the resolution of the color cameras at hand. We show that the proposed algorithm outperforms 2D image-based stereo algorithms and that the results are of higher resolution than those of off-the-shelf color-range sensors, e.g., Kinect. Moreover, the algorithm potentially exhibits real-time performance on a single CPU.
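
To make the seed-growing idea concrete, the following is a minimal sketch on a single rectified scanline; it is not the authors' implementation. It assumes the standard rectified-stereo relation d = f B / Z for turning a projected TOF depth into a disparity seed, a toy absolute-difference similarity, and a Gaussian prior on disparity. All function names, the parameters `sigma` and `min_score`, and the toy data are hypothetical choices made for illustration.

```python
import heapq
import math


def depth_to_disparity(depth_m, focal_px, baseline_m):
    """Rectified-stereo relation d = f * B / Z (disparity in pixels)."""
    return focal_px * baseline_m / depth_m


def similarity(left, right, x, d):
    """Toy similarity in (0, 1]: 1 / (1 + |I_left(x) - I_right(x - d)|).

    Returns 0.0 when the match falls outside the right scanline.
    """
    xr = x - d
    if xr < 0 or xr >= len(right):
        return 0.0
    return 1.0 / (1.0 + abs(left[x] - right[xr]))


def log_posterior(sim, d, prior_d, sigma):
    """Unnormalized log-posterior: log-similarity plus Gaussian log-prior."""
    return math.log(max(sim, 1e-12)) - 0.5 * ((d - prior_d) / sigma) ** 2


def grow_seeds(left, right, seeds, prior, sigma=2.0, min_score=-1.0):
    """Best-first propagation of (x, d) seeds along one scanline.

    `prior[x]` is the coarse disparity expected at pixel x (assumed here to
    come from upsampled range data). Each accepted match proposes its two
    neighbours with disparities d-1, d, d+1; the best-scoring candidate is
    queued if its score reaches `min_score`.
    """
    disparity = {}
    heap = []
    for x, d in seeds:
        score = log_posterior(similarity(left, right, x, d), d, prior[x], sigma)
        heapq.heappush(heap, (-score, x, d))  # max-heap via negated score
    while heap:
        _, x, d = heapq.heappop(heap)
        if x in disparity:
            continue  # pixel already matched by a better candidate
        disparity[x] = d
        for nx in (x - 1, x + 1):
            if nx < 0 or nx >= len(left) or nx in disparity:
                continue
            score, nd = max(
                (log_posterior(similarity(left, right, nx, c), c, prior[nx], sigma), c)
                for c in (d - 1, d, d + 1)
            )
            if score >= min_score:
                heapq.heappush(heap, (-score, nx, nd))
    return disparity


# Toy scanlines: the right line is the left line shifted by 2 pixels.
left = [10, 20, 30, 40, 50, 60, 70, 80]
right = [30, 40, 50, 60, 70, 80, 90, 100]
result = grow_seeds(left, right, seeds=[(4, 2)], prior=[2.0] * len(left))
```

On this toy input, a single seed at pixel 4 spreads to pixels 2 through 7 with disparity 2, while pixels 0 and 1 stay unmatched because their candidates score below `min_score`. A 2D version would propagate to image neighbours instead of scanline neighbours and would use a more robust similarity measure.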

High-Resolution Depth Maps Based on TOF-Stereo Fusion
