Abstract: Real-time dense mapping has recently become an active research topic in the mobile robotics community, since dense maps provide more informative and continuous features than sparse maps. Implicit depth representations derived from deep neural networks (e.g., the depth code) have been employed in visual-only and visual-inertial simultaneous localization and mapping (SLAM) systems, which achieve promising performance in estimating both camera motion and local dense geometry from monocular images. However, existing visual-inertial SLAM systems that use depth codes are either built on a filter-based framework, which can only update poses and maps within a relatively small local time window, or on a loosely-coupled framework, in which the prior geometric constraints from the depth estimation network are not used to improve state estimation. To address these drawbacks, we propose DiT-SLAM, a novel real-time Dense visual-inertial SLAM system with implicit depth representation and Tightly-coupled graph optimization. Most importantly, the poses, sparse maps, and low-dimensional depth codes are optimized jointly in a tightly-coupled graph that considers the visual, inertial, and depth residuals simultaneously. In addition, we propose a lightweight monocular depth estimation and completion network that combines attention mechanisms with a conditional variational auto-encoder (CVAE) to predict uncertainty-aware dense depth maps from lower-dimensional codes. Furthermore, we propose a robust point sampling strategy that accounts for the spatial distribution of 2D feature points to provide geometric constraints in the tightly-coupled optimization, especially in textureless or featureless indoor environments. We evaluate our system on open benchmarks. The proposed methods achieve better performance in both dense depth estimation and trajectory estimation than the baseline and other systems.

DiT-SLAM: Real-Time Dense Visual-Inertial SLAM with Implicit Depth Representation and Tightly-Coupled Graph Optimization
