Abstract: Stereopsis has widespread appeal in robotics as it is the predominant way by which living beings perceive depth to navigate our 3D world. Event cameras are novel bio-inspired sensors that detect per-pixel brightness changes asynchronously, with very high temporal resolution and high dynamic range, enabling machine perception in high-speed motion and broad illumination conditions. The high temporal precision also benefits stereo matching, making disparity (depth) estimation a popular research area for event cameras ever since their inception. Over the last 30 years, the field has evolved rapidly, from low-latency, low-power circuit design to current deep learning (DL) approaches driven by the computer vision community. The bibliography is vast and difficult to navigate for non-experts due to its highly interdisciplinary nature. Past surveys have addressed distinct aspects of this topic, in the context of applications, or focusing only on a specific class of techniques, but have overlooked stereo datasets. This survey provides a comprehensive overview, covering both instantaneous stereo and long-term methods suitable for simultaneous localization and mapping (SLAM), along with theoretical and empirical comparisons. It is the first to extensively review DL methods as well as stereo datasets, and it provides practical suggestions for creating new benchmarks to advance the field. The main advantages and challenges faced by event-based stereo depth estimation are also discussed. Despite significant progress, challenges remain in achieving optimal performance not only in accuracy but also in efficiency, a cornerstone of event-based computing. We identify several gaps and propose future research directions. We hope this survey inspires future research in this area by serving as an accessible entry point for newcomers, as well as a practical guide for seasoned researchers in the community.

Event-based Stereo Depth Estimation: A Survey