On combining visual SLAM and visual odometry

Brian Williams,Ian Reid
DOI: https://doi.org/10.1109/robot.2010.5509248
2010-05-01
Abstract:Sequential monocular SLAM systems perform drift free tracking of the pose of a camera relative to a jointly estimated map of landmarks. To allow real-time operation in moderately sized environments, the map is kept quite spare with usually only tens of landmarks visible in each frame. In contrast, visual odometry techniques track hundreds of visual features per frame. This leads to a very accurate estimate of the relative camera motion, but without a persistent map, the estimate tends to drift over time. We demonstrate a new monocular SLAM system which combines the benefits of these two techniques. In addition to maintaining a sparse map of landmarks in the world, our system finds as many inter frame point matches as possible. These point matches provide additional constraints on the inter-frame motion of the camera leading to a more accurate pose estimate, and, since they are not maintained as full map landmarks, they do not cause a large increase in the computational cost. Our results in both a simulated environment and in real video demonstrate the improvement in estimation accuracy gained by the inclusion of visual odometry style observations. The constraints available from pairwise point matches are most naturally cast in the context of a camera-centric rather than world-centric frame. To that end we recast the usual world-centric EKF implementation of visual SLAM in a robo-centric frame. We show that this robo-centric visual SLAM, as expected, leads to the estimated uncertainty more closely matching the ideal uncertainty; i.e., that robo-centric visual SLAM yields a more consistent estimate than the traditional world-centric EKF algorithm.
What problem does this paper attempt to address?