Temporal-Aware SfM-Learner: Unsupervised Learning Monocular Depth and Motion from Stereo Video Clips.

Lanqing Zhang, Ge Li, Thomas H. Li
DOI: https://doi.org/10.1109/mipr49039.2020.00059
2020-01-01
Abstract: Unsupervised learning-based monocular visual odometry (VO) is attracting considerable interest because it learns without labeled data and is robust to environmental variations. However, most existing unsupervised learning methods are far less accurate than their geometry-based counterparts. The lack of a drift correction procedure has been confirmed to cause significant adverse effects. To solve this problem, we propose an LSTM-based end-to-end VO system, which mainly uses temporal and spatial photometric losses to generate depth maps and 6-DoF relative pose transformations. It also leverages geometric consistency over SE(3) to improve performance and robustness by recursively constructing a new loss function on the pose graph. Detailed quantitative and qualitative evaluations on KITTI show that 1) our method outperforms most unsupervised learning methods on pose estimation tasks, and 2) our system also gives competitive results on depth estimation tasks.
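The geometric consistency idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's loss function, whose exact form the abstract does not give. It only shows the underlying constraint: composing the relative poses of two consecutive frame pairs should reproduce the direct relative pose across both, and any mismatch can serve as a drift-sensitive residual. The helper names `se3` and `consistency_residual`, and the convention that `T_xy` is the pose of frame y expressed in frame x, are assumptions made for illustration.

```python
import numpy as np

def se3(rotvec, t):
    """Build a 4x4 homogeneous transform from an axis-angle vector and a translation."""
    rotvec = np.asarray(rotvec, dtype=float)
    theta = np.linalg.norm(rotvec)
    K = np.zeros((3, 3))
    if theta > 1e-12:
        k = rotvec / theta
        # Skew-symmetric matrix of the unit rotation axis.
        K = np.array([[0.0, -k[2], k[1]],
                      [k[2], 0.0, -k[0]],
                      [-k[1], k[0], 0.0]])
    # Rodrigues' rotation formula.
    R = np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def consistency_residual(T_ab, T_bc, T_ac):
    """Return (rotation angle, translation norm) of the error transform
    inv(T_ac) @ T_ab @ T_bc. Both are zero when the poses are consistent."""
    E = np.linalg.inv(T_ac) @ (T_ab @ T_bc)
    cos_angle = np.clip((np.trace(E[:3, :3]) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.arccos(cos_angle)), float(np.linalg.norm(E[:3, 3]))

# Hypothetical relative poses between frames a->b and b->c.
T_ab = se3([0.0, 0.02, 0.0], [0.0, 0.0, 1.0])
T_bc = se3([0.0, 0.03, 0.0], [0.1, 0.0, 1.1])

# A consistent direct estimate a->c, and one corrupted by a small drift.
T_ac = T_ab @ T_bc
T_ac_drift = T_ac @ se3([0.0, 0.0, 0.01], [0.05, 0.0, 0.0])

rot_ok, trans_ok = consistency_residual(T_ab, T_bc, T_ac)
rot_bad, trans_bad = consistency_residual(T_ab, T_bc, T_ac_drift)
```

In a learning setting, a residual of this kind would be computed on predicted poses and added to the photometric terms; how the paper weights and accumulates it over the pose graph is not stated in the abstract.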