Abstract: This work presents a novel RGB-D-inertial dynamic SLAM method that can enable accurate localisation when the majority of the camera view is occluded by multiple dynamic objects over a long period of time. Most dynamic SLAM approaches either remove dynamic objects as outliers when they account for a minor proportion of the visual input, or detect dynamic objects using semantic segmentation before camera tracking. Therefore, dynamic objects that cause large occlusions are difficult to detect without prior information. The remaining visual information from the static background is also not enough to support localisation when large occlusion lasts for a long period. To overcome these problems, our framework presents a robust visual-inertial bundle adjustment that simultaneously tracks the camera, estimates cluster-wise dense segmentation of dynamic objects and maintains a static sparse map by combining dense and sparse features. The experimental results demonstrate that our method achieves promising localisation and object segmentation performance compared to other state-of-the-art methods in the scenario of long-term large occlusion.

What problem does this paper attempt to address?

The paper primarily aims to address the problem of RGB-D-Inertial Simultaneous Localization and Mapping (RGB-D-Inertial SLAM) in indoor dynamic environments, where the camera view is extensively and persistently occluded by multiple dynamic objects. Specifically, the research addresses the following issues:

1. **Localization challenges under prolonged extensive occlusion**: Most existing SLAM methods assume a static environment. However, when robots interact with objects in the scene or collaborate with humans, these dynamic objects can cause prolonged extensive occlusion of the camera view, thereby affecting localization accuracy.
2. **Challenges in dynamic object detection**: Many visual SLAM methods rely on detecting regions of dynamic objects, often based on the assumption that the static background occupies the majority of the camera view, or by directly detecting predefined dynamic object categories through semantic segmentation. However, these methods struggle to work effectively when unknown dynamic objects cause prolonged extensive occlusion.

To address the above issues, the paper proposes a novel RGB-D-Inertial dynamic SLAM method that can achieve accurate localization even when the majority of the camera view is persistently occluded by multiple dynamic objects. This method combines dense and sparse features, enabling simultaneous camera tracking, cluster-level dense segmentation of dynamic objects, and maintenance of a static sparse map. Additionally, to enhance the system's robustness against prolonged extensive occlusion, the framework actively removes sparse map points from dynamic object regions and maintains a sparse model of the static background by fusing dense and sparse features.

In summary, the main contributions of this research include:

- A new method for dynamic object detection that combines sparse and dense features;
- A novel bundle adjustment (BA) process that simultaneously provides dense segmentation of dynamic objects, camera tracking, and environment mapping;
- An RGB-D-Inertial SLAM method that performs excellently under prolonged extensive occlusion caused by multiple undefined dynamic objects.
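The idea of removing sparse map points that fall inside dynamic object regions can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `remove_dynamic_map_points`, the pinhole intrinsics `fx, fy, cx, cy`, and the assumption that the dense segmentation is available as a per-pixel boolean mask are all hypothetical choices made for illustration. The sketch projects each sparse point into the image and drops it if it lands on a pixel labelled dynamic.

```python
import numpy as np

def remove_dynamic_map_points(points_cam, dynamic_mask, fx, fy, cx, cy):
    """Return a boolean keep-mask over sparse map points (illustrative only).

    points_cam:   (N, 3) map points expressed in the camera frame.
    dynamic_mask: (H, W) boolean image, True where a pixel is labelled dynamic
                  (assumed to come from some dense segmentation step).
    """
    h, w = dynamic_mask.shape
    z = points_cam[:, 2]
    in_front = z > 0
    # Avoid dividing by non-positive depth; those points are never "visible".
    safe_z = np.where(in_front, z, 1.0)
    # Pinhole projection to the nearest pixel (assumed camera model).
    u = np.round(fx * points_cam[:, 0] / safe_z + cx).astype(int)
    v = np.round(fy * points_cam[:, 1] / safe_z + cy).astype(int)
    visible = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    # Only points that project inside the image can be flagged as dynamic.
    on_dynamic = np.zeros(len(points_cam), dtype=bool)
    on_dynamic[visible] = dynamic_mask[v[visible], u[visible]]
    return ~on_dynamic

# Toy example: a 4x4 image whose right half is covered by a dynamic object.
mask = np.zeros((4, 4), dtype=bool)
mask[:, 2:] = True
points = np.array([
    [-0.5, 0.0, 1.0],   # projects onto the static left half
    [0.5, 0.0, 1.0],    # projects onto the dynamic right half
    [0.0, 0.0, -1.0],   # behind the camera, left untouched
    [5.0, 0.0, 1.0],    # projects outside the image, left untouched
])
keep = remove_dynamic_map_points(points, mask, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
static_points = points[keep]
print(keep)
```

Points that are not currently observed are kept in this sketch, since a single frame gives no evidence about them; a real system would likely combine evidence over several frames before discarding a point.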

RGB-D-Inertial SLAM in Indoor Dynamic Environments with Long-term Large Occlusion
