Visual Odometry with Deep Bidirectional Recurrent Neural Networks.

Fei Xue, Xin Wang, Qiuyuan Wang, Junqiu Wang, Hongbin Zha
DOI: https://doi.org/10.1007/978-3-030-31726-3_20
2019-01-01
Abstract: We propose a novel architecture for learning camera poses from image sequences with an extended 2D LSTM (Long Short-Term Memory). Unlike most previous deep-learning-based VO (Visual Odometry) methods, our model predicts the pose per frame with temporal information from image sequences by adopting a forward-backward process. In addition, we use 3D tensors as basic structures to generate spatial information. The network learns poses in a bottom-up manner by coupling local and global constraints. Experiments demonstrate that on the public KITTI benchmark dataset, our architecture outperforms state-of-the-art end-to-end methods in terms of camera motion prediction and is comparable to model-based methods. The network generalizes well on the Málaga dataset without extra training or fine-tuning.