Abstract: The inherent scale ambiguity greatly limits the performance of monocular visual odometry. In recent years, a variety of methods have been proposed for self-supervised learning of ego-motion and depth estimation, incorporating specifically designed scale-consistency constraints that use the estimated depth as a reference. However, these methods neglect the depth uncertainty introduced by the dominant photometric loss, which leads to unreliable depth estimates in difficult regions and degrades scale alignment. To address these problems, we introduce a feature-based visual odometry learning system with an effective scale recovery strategy. Additionally, we propose a learning method that estimates the photometric-sensitive depth uncertainty to guide the scale recovery. The proposed method is evaluated on the KITTI odometry benchmark, and the experimental results demonstrate that our system predicts scale-consistent trajectories from monocular videos and achieves state-of-the-art performance. Moreover, the proposed method achieves competitive performance on KITTI depth estimation.

Self-Supervised Learning of Monocular Visual Odometry and Depth with Uncertainty-Aware Scale Consistency