Learning Monocular Depth from Focus with Event Focal Stack

Chenxu Jiang, Mingyuan Lin, Chi Zhang, Zhenghai Wang, Lei Yu
DOI: https://doi.org/10.1109/jsen.2024.3495816
IF: 4.3
2024-01-01
IEEE Sensors Journal
Abstract: Depth from Focus estimates depth by determining the moment of maximum focus from multiple shots at different focal distances, i.e., the Focal Stack. However, the limited sampling rate of conventional optical cameras makes it difficult to obtain sufficient focus cues during the focal sweep. Inspired by biological vision, the event camera records intensity changes over time in extremely low latency, which provides more temporal information for focus time acquisition. In this study, we propose the EDFF Network to estimate sparse depth from the Event Focal Stack. Specifically, we utilize the event voxel grid to encode intensity change information and project event time surface into the depth domain to preserve per-pixel focal distance information. A Focal-Distance-guided Cross-Modal Attention Module is presented to fuse the information mentioned above. Additionally, we propose a Multi-level Depth Fusion Block designed to integrate results from each level of a UNet-like architecture and produce the final output. Extensive experiments validate that our method outperforms existing state-of-the-art approaches.
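The two input encodings named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the event tuple layout, the function names, the bilinear temporal weighting of the voxel grid, and the linear focal sweep used to map timestamps to focal distances are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Assumed event layout: (timestamp, x, y, polarity), polarity in {-1, +1}.
Event = Tuple[float, int, int, int]


def event_voxel_grid(events: List[Event], num_bins: int,
                     height: int, width: int) -> List[List[List[float]]]:
    """Accumulate polarities into num_bins temporal bins.

    Each event is split between its two nearest bins (bilinear weighting
    in time). Events are assumed to be sorted by timestamp.
    """
    grid = [[[0.0] * width for _ in range(height)] for _ in range(num_bins)]
    if not events:
        return grid
    t0, t1 = events[0][0], events[-1][0]
    span = (t1 - t0) or 1.0  # avoid division by zero for a single instant
    for t, x, y, p in events:
        t_norm = (num_bins - 1) * (t - t0) / span
        lo = int(t_norm)
        frac = t_norm - lo
        grid[lo][y][x] += p * (1.0 - frac)
        if lo + 1 < num_bins:
            grid[lo + 1][y][x] += p * frac
    return grid


def time_surface_to_focal_distance(
        events: List[Event], height: int, width: int,
        t_start: float, t_end: float,
        d_start: float, d_end: float) -> List[List[Optional[float]]]:
    """Map the latest event timestamp at each pixel to a focal distance.

    Assumes (hypothetically) that the focal distance moves linearly from
    d_start to d_end over [t_start, t_end]. Pixels with no events are None.
    """
    latest: List[List[Optional[float]]] = [[None] * width for _ in range(height)]
    for t, x, y, _ in events:
        if latest[y][x] is None or t > latest[y][x]:
            latest[y][x] = t
    scale = (d_end - d_start) / (t_end - t_start)
    return [[None if t is None else d_start + (t - t_start) * scale
             for t in row] for row in latest]
```

The voxel grid keeps the polarity and coarse timing of intensity changes, while the second map keeps a per-pixel focal distance, which is the kind of information the abstract says the attention module fuses. Pixels without events stay empty, which is consistent with the depth estimate being sparse.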