Towards Scale-Aware Self-Supervised Multi-Frame Depth Estimation with IMU Motion Dynamics.

Yipeng Lu, Denghui Zhang, Zhaoquan Gu, Jing Qiu
DOI: https://doi.org/10.1109/DSC59305.2023.00023
2023-01-01
Abstract: In recent years, self-supervised depth and ego-motion estimation have attracted extensive research attention. Self-supervised monocular depth estimation can effectively learn information from texture-less and non-Lambertian surfaces. However, the accuracy of monocular methods is limited by the depth ambiguity inherent in monocular imaging. Multi-frame methods, on the other hand, usually achieve higher depth accuracy by exploiting the geometric constraints of Multi-View Stereo (MVS). However, MVS is sensitive to non-Lambertian surfaces, moving objects, and texture-less regions, which often leads to wrong estimates. In addition, both kinds of method struggle to recover absolute scale because of the inherent scale ambiguity of monocular images. To resolve these problems, we propose FusionDepth, a scale-aware method that fuses information from monocular depth estimation, multi-frame depth estimation, and an Inertial Measurement Unit (IMU). FusionDepth fuses multi-frame and monocular depth estimates through an uncertainty mask so that each compensates for the other's weaknesses. It further fuses IMU information with an Extended Kalman Filter (EKF) to learn the absolute metric scale. To run the model in real time on resource-limited devices, we compact the framework by sharing the encoder and using two lightweight decoders, which lowers the cost of pose prediction and of MVS depth prediction. We verify FusionDepth's effectiveness on the KITTI benchmark; comparison experiments indicate that our method is efficient and accurate.
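The abstract does not say how the uncertainty mask is computed or applied, so the following is only a minimal sketch of one plausible reading: per pixel, keep the multi-frame depth where its uncertainty is low and fall back to the monocular depth elsewhere. The function name `fuse_depth`, the hard `threshold`, and the toy values are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List

Depth = List[List[float]]  # a depth (or uncertainty) map as rows of pixels


def fuse_depth(mono: Depth, multi: Depth, uncertainty: Depth,
               threshold: float = 0.5) -> Depth:
    """Hypothetical hard-mask fusion of two depth maps.

    Where the multi-frame uncertainty is below `threshold`, the multi-frame
    depth is kept; otherwise the monocular depth is used instead.
    """
    fused = []
    for mono_row, multi_row, unc_row in zip(mono, multi, uncertainty):
        fused.append([
            mv if u < threshold else mo
            for mo, mv, u in zip(mono_row, multi_row, unc_row)
        ])
    return fused


# Toy 2x2 example (made-up values): the top-right pixel has high
# multi-frame uncertainty, e.g. because it lies on a moving object.
mono = [[10.0, 12.0], [8.0, 9.0]]
multi = [[10.4, 30.0], [7.9, 9.2]]
uncertainty = [[0.1, 0.9], [0.2, 0.3]]

fused = fuse_depth(mono, multi, uncertainty)
print(fused)
```

In this toy case only the top-right pixel falls back to the monocular estimate. A learned or soft (weighted) mask would be an equally valid reading of the abstract; the hard threshold is chosen here purely for brevity.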