Abstract: Real-time dense mapping has recently become an active research topic in the mobile robotics community, since dense maps provide more informative and continuous features than sparse maps. Implicit depth representations derived from deep neural networks (e.g., the depth code) have been employed in visual-only and visual-inertial simultaneous localization and mapping (SLAM) systems, which achieve promising performance in estimating both camera motion and local dense geometry from monocular images. However, existing visual-inertial SLAM systems that use depth codes are either built on a filter-based framework, which can only update poses and maps within a relatively small local time window, or on a loosely-coupled framework, in which the prior geometric constraints from the depth estimation network are not exploited to improve state estimation. To address these drawbacks, we propose DiT-SLAM, a novel real-time Dense visual-inertial SLAM system with implicit depth representation and Tightly-coupled graph optimization. Most importantly, the poses, sparse maps, and low-dimensional depth codes are jointly optimized in a tightly-coupled graph that considers the visual, inertial, and depth residuals simultaneously. In addition, we propose a lightweight monocular depth estimation and completion network, which combines attention mechanisms with a conditional variational auto-encoder (CVAE) to predict uncertainty-aware dense depth maps from lower-dimensional codes. Furthermore, we propose a robust point sampling strategy that takes the spatial distribution of 2D feature points into account to provide geometric constraints in the tightly-coupled optimization, especially for textureless or featureless scenes in indoor environments. We evaluate our system on open benchmarks. The proposed methods achieve better performance in both dense depth estimation and trajectory estimation than the baseline and other systems.

M-DIVO: Multiple ToF RGB-D Cameras Enhanced Depth-Inertial-Visual Odometry

R2DIO: A Robust and Real-Time Depth-Inertial Odometry Leveraging Multimodal Constraints for Challenging Environments

VIDO: A Robust and Consistent Monocular Visual-Inertial-Depth Odometry

DVIO: Depth aided visual inertial odometry for RGBD sensors

Multi-Sensor Fusion Self-Supervised Deep Odometry and Depth Estimation

LVIO-SAM: A Multi-sensor Fusion Odometry via Smoothing and Mapping

Real-Time Optimization-Based Dense Mapping System of RGBD-Inertial Odometry

Depth Enhanced Visual-Inertial Odometry Based on Multi-State Constraint Kalman Filter

DiT-SLAM: Real-Time Dense Visual-Inertial SLAM with Implicit Depth Representation and Tightly-Coupled Graph Optimization

A Depth-added Visual-Inertial Odometry Based on MEMS IMU with Fast Initialization

VINS-MKF: A Tightly-Coupled Multi-Keyframe Visual-Inertial Odometry for Accurate and Robust State Estimation

RMSC-VIO: Robust Multi-Stereoscopic Visual-Inertial Odometry for Local Visually Challenging Scenarios

Tightly Coupled Visual-Inertial Fusion with Image Enhancement for Robust Positioning

Real-Time Dense Construction with Deep Multiview Stereo Using Camera and IMU Sensors

DVIO: an Optimization-Based Tightly Coupled Direct Visual-Inertial Odometry

Direct Visual-Inertial Odometry with Semi-Dense Mapping

Real-Time Dense Visual Odometry for RGB-D Cameras

FAST-LIVO: Fast and Tightly-coupled Sparse-Direct LiDAR-Inertial-Visual Odometry

4DRVO-Net: Deep 4D Radar-Visual Odometry Using Multi-Modal and Multi-Scale Adaptive Fusion