RGB-D Odometry and SLAM

Javier Civera, Seong Hun Lee
DOI: https://doi.org/10.1007/978-3-030-28603-3_6
2020-01-20
Abstract: The emergence of modern RGB-D sensors has had a significant impact on many application fields, including robotics, augmented reality (AR) and 3D scanning. They are low-cost, low-power and compact alternatives to traditional range sensors such as LiDAR. Moreover, unlike RGB cameras, RGB-D sensors provide additional depth information that removes the need for frame-by-frame triangulation in 3D scene reconstruction. These merits have made them very popular in mobile robotics and AR, where it is of great interest to estimate ego-motion and 3D scene structure. Such spatial understanding can enable robots to navigate autonomously without collisions and allow users to insert virtual entities consistent with the image stream. In this chapter, we review common formulations of odometry and Simultaneous Localization and Mapping (SLAM) using RGB-D stream input. The two topics are closely related: the former aims to track the incremental camera motion with respect to a local map of the scene, and the latter to jointly and consistently estimate the camera trajectory and the global map. In both cases, the standard approaches minimize a cost function using nonlinear optimization techniques. This chapter consists of three main parts. In the first part, we introduce the basic concepts of odometry and SLAM and motivate the use of RGB-D sensors. We also give mathematical preliminaries relevant to most odometry and SLAM algorithms. In the second part, we detail the three main components of SLAM systems: camera pose tracking, scene mapping and loop closing. For each component, we describe different approaches proposed in the literature. In the final part, we provide a brief discussion of advanced research topics with references to the state of the art.
Computer Vision and Pattern Recognition, Robotics
What problem does this paper attempt to address?
The paper addresses the problem of estimating a camera's motion (its translation and rotation) and reconstructing the 3D structure of the scene from RGB-D sensor data. Specifically, it focuses on two closely related topics:

1. **RGB-D Odometry**: The goal is to estimate the incremental motion of the camera relative to a local map of the scene as the camera moves. This usually involves extracting features from a continuous RGB-D image stream and using them to estimate the relative change in camera pose between different time points.
2. **Simultaneous Localization and Mapping (SLAM)**: The goal is to jointly estimate the camera's trajectory and the global map of the scene while keeping the two consistent. This requires updating the camera's pose in real time while exploring an unknown environment and, at the same time, gradually building an accurate 3D model of that environment.

To this end, the paper reviews common odometry and SLAM algorithms that operate on RGB-D input streams. These methods typically minimize a cost function, which can be a photometric error or a geometric error, using nonlinear optimization techniques. The paper also details the three main components of a SLAM system (camera pose tracking, scene mapping, and loop closure) and describes the various approaches proposed in the literature for each. Finally, it discusses some advanced research topics and provides relevant references.
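To make the idea of minimizing a geometric error with nonlinear optimization more concrete, here is a minimal sketch in plain Python. It is not the chapter's algorithm. It assumes a pinhole camera with illustrative intrinsics, known point correspondences between two frames, and a deliberately simplified 3-parameter pose (yaw about the camera's y axis plus translation along x and z) instead of a full 6-DoF pose. All function names (`backproject`, `transform`, `geometric_cost`, `solve3`, `gauss_newton`) are hypothetical and chosen for illustration only.

```python
import math

def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with metric depth to a 3D point in the camera frame.
    This is where the depth channel replaces triangulation."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)

def transform(pose, p):
    """Apply the simplified pose (yaw about y, tx, tz) to a 3D point."""
    yaw, tx, tz = pose
    c, s = math.cos(yaw), math.sin(yaw)
    x, y, z = p
    return (c * x + s * z + tx, y, -s * x + c * z + tz)

def geometric_cost(pose, src, dst):
    """Sum of squared point-to-point distances after applying the pose."""
    total = 0.0
    for p, q in zip(src, dst):
        total += sum((a - b) ** 2 for a, b in zip(transform(pose, p), q))
    return total

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def gauss_newton(src, dst, pose=(0.0, 0.0, 0.0), iterations=10):
    """Minimize the geometric cost over (yaw, tx, tz) with Gauss-Newton steps."""
    pose = list(pose)
    for _ in range(iterations):
        H = [[0.0] * 3 for _ in range(3)]  # approximate Hessian J^T J
        g = [0.0] * 3                      # gradient term J^T r
        c, s = math.cos(pose[0]), math.sin(pose[0])
        for p, q in zip(src, dst):
            x, z = p[0], p[2]
            tp = transform(pose, p)
            # One (Jacobian row, residual) pair each for the x and z coordinates;
            # the y coordinate does not depend on the parameters.
            rows = (
                ((-s * x + c * z, 1.0, 0.0), tp[0] - q[0]),
                ((-c * x - s * z, 0.0, 1.0), tp[2] - q[2]),
            )
            for J, r in rows:
                for i in range(3):
                    g[i] += J[i] * r
                    for j in range(3):
                        H[i][j] += J[i] * J[j]
        delta = solve3(H, [-gi for gi in g])
        pose = [a + d for a, d in zip(pose, delta)]
    return tuple(pose)
```

A possible usage, with synthetic data: back-project a handful of pixels from one frame, move them with a known pose to simulate the second frame, and call `gauss_newton(src, dst)` to recover that pose. Real systems would replace the fixed correspondences with feature matching or projective data association, optimize over the full six degrees of freedom, and often add a photometric term and a robust loss.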