Salient Sparse Visual Odometry With Pose-Only Supervision

Siyu Chen, Kangcheng Liu, Chen Wang, Shenghai Yuan, Jianfei Yang, Lihua Xie
DOI: https://doi.org/10.1109/LRA.2024.3384757
2024-04-07
Abstract: Visual Odometry (VO) is vital for the navigation of autonomous systems, providing accurate position and orientation estimates at reasonable costs. While traditional VO methods excel in some conditions, they struggle with challenges like variable lighting and motion blur. Deep learning-based VO, though more adaptable, can face generalization problems in new environments. Addressing these drawbacks, this paper presents a novel hybrid visual odometry (VO) framework that leverages pose-only supervision, offering a balanced solution between robustness and the need for extensive labeling. We propose two cost-effective and innovative designs: a self-supervised homographic pre-training for enhancing optical flow learning from pose-only labels and a random patch-based salient point detection strategy for more accurate optical flow patch extraction. These designs eliminate the need for dense optical flow labels for training and significantly improve the generalization capability of the system in diverse and challenging environments. Our pose-only supervised method achieves competitive performance on standard datasets and greater robustness and generalization ability in extreme and unseen scenarios, even compared to dense optical flow-supervised state-of-the-art methods.
Computer Vision and Pattern Recognition, Robotics
What problem does this paper attempt to address?
The paper addresses two limitations in Visual Odometry (VO):

1. **Limitations of traditional VO methods**: Geometry-based methods perform well under favorable conditions, but their accuracy degrades in challenging situations such as lighting changes and motion blur.
2. **Generalization problem of deep learning VO methods**: Learning-based methods are more adaptable, but their performance drops in new environments that differ from the training data.

To address these issues, the paper proposes a hybrid VO framework trained with pose-only supervision, which aims to balance system robustness against the need for large amounts of labeled data. The method includes two core designs:

- **Self-supervised homographic pre-training**: This step improves the network's ability to learn optical flow from pose-only labels, providing guidance for the subsequent sparse optical flow estimation task.
- **Random patch-based salient point detection**: A salient point detection module identifies points with distinctive features in the image, and these points serve as the basis for tracking, which improves the accuracy of optical flow patch extraction.

Additionally, the paper introduces a weighted bundle adjustment layer to optimize camera pose and depth.

The pose-only supervised method achieves competitive performance on standard datasets and shows stronger robustness and generalization in extreme and unseen scenarios, even compared to state-of-the-art methods that require dense optical flow supervision.
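The two core designs can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the summary does not specify how homographies are sampled or how saliency is computed, so the corner-jitter sampling, the `max_shift` parameter, the precomputed `saliency` map, and all function names below are assumptions made for illustration. The first part shows why a known homography yields optical flow labels for free, and the second shows one plausible reading of "random patch-based" selection: take the most salient pixel inside each randomly placed patch.

```python
import numpy as np


def random_homography(rng, height, width, max_shift=4.0):
    """Sample a homography by jittering the four image corners (assumed scheme)."""
    src = np.array([[0.0, 0.0], [width - 1.0, 0.0],
                    [width - 1.0, height - 1.0], [0.0, height - 1.0]])
    dst = src + rng.uniform(-max_shift, max_shift, size=src.shape)
    # Direct linear transform: each corner correspondence gives two equations.
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the null vector of the 8x9 system.
    _, _, vt = np.linalg.svd(np.array(rows))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]


def homography_flow(H, height, width):
    """Dense flow of shape (height, width, 2) induced by warping each pixel with H."""
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float64)
    pts = np.stack([xs, ys, np.ones_like(xs)], axis=-1)
    warped = pts @ H.T
    warped = warped[..., :2] / warped[..., 2:3]
    # Flow is the displacement of each pixel: a label obtained without annotation.
    return warped - pts[..., :2]


def salient_points_in_random_patches(saliency, num_patches, patch_size, rng):
    """Return (x, y) of the highest-saliency pixel in each randomly placed patch."""
    height, width = saliency.shape
    points = []
    for _ in range(num_patches):
        top = rng.integers(0, height - patch_size + 1)
        left = rng.integers(0, width - patch_size + 1)
        window = saliency[top:top + patch_size, left:left + patch_size]
        dy, dx = np.unravel_index(np.argmax(window), window.shape)
        points.append((left + dx, top + dy))
    return np.array(points)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H = random_homography(rng, 64, 64)
    flow = homography_flow(H, 64, 64)
    saliency = rng.random((64, 64))  # stand-in for a learned saliency map
    points = salient_points_in_random_patches(saliency, 8, 16, rng)
    print(flow.shape, points.shape)
```

In a pre-training loop along these lines, an image and its homography-warped copy would be fed to the flow network, with `homography_flow` supplying the target. Sampling patches at random before taking the per-patch maximum spreads the selected points across the image instead of clustering them around a single strong feature.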