GloFP-MSF: Monocular Scene Flow Estimation with Global Feature Perception

Xuezhi Xiang, Yu Cui, Xi Wang, Mingliang Zhai, Abdulmotaleb El Saddik
DOI: https://doi.org/10.1007/s00530-024-01418-5
IF: 3.9
2024-01-01
Multimedia Systems
Abstract: Monocular scene flow estimation recovers 3D structure and 3D motion from consecutive monocular images. Previous monocular scene flow methods have usually focused on directly enhancing image features and motion features while neglecting how those features are used in the decoder, which is equally crucial for accurate scene flow estimation. Based on cross-covariance attention, we propose a global feature perception module (GFPM) and apply it to the decoder. This enables the decoder to make effective use of the motion features and image features of the current layer, together with the coarse scene flow estimate from the previous layer, thus enhancing the decoder’s recovery of 3D motion information. In addition, we propose a parallel architecture of self-attention and convolution (PCSA) for feature extraction, which enhances the global expressiveness of the extracted image features. Our proposed method demonstrates remarkable performance on the KITTI 2015 dataset, achieving a relative improvement of 17.6
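The abstract names cross-covariance attention as the basis of GFPM but does not describe the module itself. As a rough illustration of the general operation only, and not of the authors' GFPM, the following NumPy sketch computes single-head cross-covariance attention: attention is formed between feature channels (a d x d map) rather than between tokens. The function name, the `temperature` parameter, the division by it, and the (N, d) input layout are assumptions made for this example.

```python
import numpy as np


def cross_covariance_attention(q, k, v, temperature=1.0):
    """Illustrative single-head cross-covariance attention (hypothetical helper).

    q, k, v: arrays of shape (N, d), i.e. N tokens (pixels) with d channels.
    Returns an array of shape (N, d).
    """
    eps = 1e-12
    # L2-normalise every channel (column) across the token dimension.
    q_hat = q / (np.linalg.norm(q, axis=0, keepdims=True) + eps)
    k_hat = k / (np.linalg.norm(k, axis=0, keepdims=True) + eps)

    # Channel-to-channel similarity: shape (d, d), independent of N.
    logits = (k_hat.T @ q_hat) / temperature

    # Softmax over the key-channel axis, so each column sums to 1.
    logits = logits - logits.max(axis=0, keepdims=True)
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=0, keepdims=True)

    # Each output channel is a convex combination of the value channels.
    return v @ attn


rng = np.random.default_rng(0)
tokens, channels = 6, 4
q = rng.normal(size=(tokens, channels))
k = rng.normal(size=(tokens, channels))
v = rng.normal(size=(tokens, channels))
out = cross_covariance_attention(q, k, v)
print(out.shape)
```

Because the attention map is d x d, its size does not grow with the number of tokens, which is one reason this form of attention is attractive for dense prediction on high-resolution feature maps.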