RGB-D-Inertial SLAM in Indoor Dynamic Environments with Long-term Large Occlusion

Ran Long, Christian Rauch, Vladimir Ivan, Tin Lun Lam, Sethu Vijayakumar
2023-03-23
Abstract: This work presents a novel RGB-D-inertial dynamic SLAM method that can enable accurate localisation when the majority of the camera view is occluded by multiple dynamic objects over a long period of time. Most dynamic SLAM approaches either remove dynamic objects as outliers when they account for a minor proportion of the visual input, or detect dynamic objects using semantic segmentation before camera tracking. Therefore, dynamic objects that cause large occlusions are difficult to detect without prior information. The remaining visual information from the static background is also not enough to support localisation when large occlusion lasts for a long period. To overcome these problems, our framework presents a robust visual-inertial bundle adjustment that simultaneously tracks the camera, estimates cluster-wise dense segmentation of dynamic objects and maintains a static sparse map by combining dense and sparse features. The experimental results demonstrate that our method achieves promising localisation and object segmentation performance compared to other state-of-the-art methods in the scenario of long-term large occlusion.
Robotics
What problem does this paper attempt to address?
The paper primarily aims to address the problem of RGB-D-Inertial Simultaneous Localization and Mapping (RGB-D-Inertial SLAM) in indoor dynamic environments, where the camera view is extensively and persistently occluded by multiple dynamic objects. Specifically, the research addresses the following issues:

1. **Localization challenges under prolonged extensive occlusion**: Most existing SLAM methods assume a static environment. However, when robots interact with objects in the scene or collaborate with humans, these dynamic objects can cause prolonged extensive occlusion of the camera view, thereby affecting localization accuracy.
2. **Challenges in dynamic object detection**: Many visual SLAM methods rely on detecting regions of dynamic objects, often based on the assumption that the static background occupies the majority of the camera view, or by directly detecting predefined dynamic object categories through semantic segmentation. However, these methods struggle to work effectively when unknown dynamic objects cause prolonged extensive occlusion.

To address the above issues, the paper proposes a novel RGB-D-Inertial dynamic SLAM method that can achieve accurate localization even when the majority of the camera view is persistently occluded by multiple dynamic objects. This method combines dense and sparse features, enabling simultaneous camera tracking, cluster-level dense segmentation of dynamic objects, and maintenance of a static sparse map. Additionally, to enhance the system's robustness against prolonged extensive occlusion, the framework actively removes sparse map points from dynamic object regions and maintains a sparse model of the static background by fusing dense and sparse features.
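The idea of removing sparse points that lie in dynamic object regions can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `filter_static_points`, the per-cluster weight array, and the `threshold` value are all hypothetical, and the sketch simply assumes that a dense per-pixel cluster labelling and a static-likelihood weight per cluster are already available.

```python
import numpy as np

def filter_static_points(keypoints_uv, cluster_labels, cluster_static_weight, threshold=0.5):
    """Return a boolean mask marking sparse keypoints that lie on static clusters.

    keypoints_uv:          (N, 2) pixel coordinates, u = column, v = row.
    cluster_labels:        (H, W) integer image, one cluster id per pixel.
    cluster_static_weight: (K,) weight in [0, 1] per cluster (1 = static).
    threshold:             hypothetical cut-off, an assumption for this sketch.
    """
    uv = np.rint(np.asarray(keypoints_uv, dtype=float)).astype(int)
    weights = np.asarray(cluster_static_weight, dtype=float)
    h, w = cluster_labels.shape

    # Keypoints outside the image cannot be assigned to a cluster and are dropped.
    in_bounds = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)

    keep = np.zeros(len(uv), dtype=bool)
    # Look up the cluster under each in-bounds keypoint (row index = v, column = u).
    labels = cluster_labels[uv[in_bounds, 1], uv[in_bounds, 0]]
    keep[in_bounds] = weights[labels] >= threshold
    return keep
```

A caller would apply the returned mask to its keypoint or map-point array, so that only points on clusters judged static contribute to tracking and mapping.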
In summary, the main contributions of this research include:

- A new method for dynamic object detection that combines sparse and dense features;
- A novel bundle adjustment (BA) process that simultaneously provides dense segmentation of dynamic objects, camera tracking, and environment mapping;
- An RGB-D-Inertial SLAM method that localizes accurately under prolonged extensive occlusion caused by multiple undefined dynamic objects.
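To make the notion of cluster-wise segmentation more concrete, the sketch below turns per-pixel alignment residuals into one static-likelihood weight per cluster. The formulation is an assumption chosen for illustration (a Cauchy-style weight on the mean residual of each cluster), not the paper's actual cost function; `cluster_weights` and `scale` are hypothetical names.

```python
import numpy as np

def cluster_weights(residuals, cluster_labels, num_clusters, scale=1.0):
    """Map per-pixel residuals to one weight in (0, 1] per cluster.

    residuals:      (H, W) per-pixel alignment residuals.
    cluster_labels: (H, W) integer cluster id per pixel, in [0, num_clusters).
    scale:          hypothetical residual scale, an assumption for this sketch.
    """
    flat_r = np.abs(np.asarray(residuals, dtype=float)).ravel()
    flat_l = np.asarray(cluster_labels).ravel()

    # Sum and count residuals per cluster, then take the mean.
    sums = np.bincount(flat_l, weights=flat_r, minlength=num_clusters)
    counts = np.bincount(flat_l, minlength=num_clusters)
    mean_r = sums / np.maximum(counts, 1)  # empty clusters get mean 0, hence weight 1

    # Cauchy-style weight: large mean residual gives a weight near 0 (likely dynamic).
    return 1.0 / (1.0 + (mean_r / scale) ** 2)
```

Clusters whose pixels disagree with the estimated camera motion receive low weights, which is one simple way a dense segmentation could emerge alongside tracking.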