Abstract: Self-supervised monocular visual odometry has the crucial advantage of not depending on labels and has shown strong performance in autonomous driving and robotics. However, recent methods rely on coarse semantic masks to handle dynamic objects, which limits their feature representations and reduces accuracy in dynamic environments. In contrast to these coarse-grained methods, we present Fine-MVO, a novel self-supervised monocular visual odometry method that handles dynamic objects using implicit fine-grained feature representations, achieving high accuracy and robustness in dynamic environments. First, Fine-MVO introduces an efficient cross-feature augmentation module and a novel loss weight balance strategy to leverage fine-grained features with implicit semantic information, which substantially improves depth estimation accuracy, especially at object boundaries. Second, we design a novel pose-feature enhancement module and a two-stage training policy that lead the pose network to focus on robust static regions and temporal information, thereby improving pose estimation in dynamic and long-term scenarios. Extensive experiments demonstrate the accuracy and generalization of Fine-MVO. Specifically, Fine-MVO achieves a 36.80% improvement in pose accuracy over the state-of-the-art method on the KITTI dataset, even surpassing geometry-based visual odometry methods that use loop closure. Furthermore, Fine-MVO generalizes well to the outdoor dataset AirDOS-Shibuya, with a 28.22% improvement over the current advanced method. Fine-MVO also generalizes well to the indoor dataset TUM-RGBD.

MambaVO: Deep Visual Odometry Based on Sequential Matching Refinement and Training Smoothing

PALVO: Visual Odometry Based on Panoramic Annular Lens

High-Performance Visual Odometry with Two-Stage Local Binocular BA and GPU

Design of an Enhanced Visual Odometry by Building and Matching Compressive Panoramic Landmarks Online

PVO: Panoptic Visual Odometry

DeepAVO: Efficient Pose Refining with Feature Distilling for Deep Visual Odometry

BEV-ODOM: Reducing Scale Drift in Monocular Visual Odometry with BEV Representation

Fine-MVO: Toward Fine-Grained Feature Enhancement for Self-Supervised Monocular Visual Odometry in Dynamic Environments

DF-VO: What Should Be Learnt for Visual Odometry?

Deep Visual Odometry with Adaptive Memory

Deep Visual Odometry with Events and Frames

Learning Generalized Visual Odometry Using Position-Aware Optical Flow and Geometric Bundle Adjustment

DM-VIO: Delayed Marginalization Visual-Inertial Odometry

GSL-VO: A Geometric-Semantic Information Enhanced Lightweight Visual Odometry in Dynamic Environments

Pose Refinement: Bridging the Gap Between Unsupervised Learning and Geometric Methods for Visual Odometry

MCVO: A Generic Visual Odometry for Arbitrarily Arranged Multi-Cameras

Multimotion Visual Odometry (MVO)

MambaTrack: A Simple Baseline for Multiple Object Tracking with State Space Model

Approaches, Challenges, and Applications for Deep Visual Odometry: Toward Complicated and Emerging Areas

Beyond Learning: Back to Geometric Essence of Visual Odometry via Fusion-Based Paradigm

Unsupervised Monocular Visual-Inertial Odometry Network