Data Association Between Event Streams and Intensity Frames Under Diverse Baselines.

Zhang Dehao, Ding Qiankun, Duan Peiqi, Zhou Chu, Shi Boxin
DOI: https://doi.org/10.1007/978-3-031-20071-7_5
2022-01-01
Abstract: This paper proposes a learning-based framework to associate event streams and intensity frames under diverse camera baselines, to simultaneously benefit camera pose estimation under large baselines and depth estimation under small baselines. Based on the observation that event streams are globally sparse (a small percentage of pixels in global frames are triggered with events) and locally dense (a large percentage of pixels in local patches are triggered with events) in the spatial domain, we put forward a two-stage architecture for matching feature maps. LSparse-Net uses a large receptive field to find sparse matches while SDense-Net uses a small receptive field to find dense matches. Both stages apply Transformer modules with self-attention layers and cross-attention layers to effectively process multi-resolution features from the feature pyramid network backbone. Experimental results on public datasets show a systematic performance improvement for both tasks compared to state-of-the-art methods.
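The "globally sparse, locally dense" observation in the abstract can be illustrated with a small numeric sketch. This is not the authors' code: the synthetic event mask, the 8-pixel patch size, and the helper names `event_density` and `patch_densities` are assumptions made purely for illustration. It compares the fraction of triggered pixels over the whole frame with the fraction inside the patches that contain events.

```python
import numpy as np

def event_density(event_mask):
    """Fraction of pixels that triggered at least one event."""
    return np.count_nonzero(event_mask) / event_mask.size

def patch_densities(event_mask, patch):
    """Event density of each non-overlapping patch x patch tile."""
    h, w = event_mask.shape
    # Crop so both dimensions are multiples of the patch size.
    h, w = h - h % patch, w - w % patch
    tiles = event_mask[:h, :w].reshape(h // patch, patch, w // patch, patch)
    return tiles.mean(axis=(1, 3))

# Hypothetical 64x64 frame in which a single 8x8 region fired events.
mask = np.zeros((64, 64), dtype=bool)
mask[16:24, 16:24] = True

global_density = event_density(mask)   # density over the whole frame
local = patch_densities(mask, 8)       # one density value per 8x8 tile
active = local[local > 0]              # tiles containing any event

print(f"global density: {global_density:.4f}")
print(f"mean density of active patches: {active.mean():.4f}")
```

In this toy frame the global density is low while the one active patch is fully triggered, which is the kind of contrast that would motivate pairing a large receptive field for sparse matches with a small receptive field for dense matches.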