Bridging the Robot Perception Gap with Mid-Level Vision

Chi Li,Jonathan Bohren,Gregory D. Hager
DOI: https://doi.org/10.1007/978-3-319-60916-4_1
2017-07-25
Abstract:The practical application of machine perception to support physical manipulation in unstructured environments remains a barrier to the development of intelligent robotic systems. Recently, great progress has been made by the large-scale machine perception community, but these methods have made few contributions to the applied robotic perception. This is in part because such large-scale systems are designed to recognize category labels of large numbers of objects from a single image, rather than highly accurate, efficient, and robust pose estimation in environments for which a robot has reliable prior knowledge. In this paper, we illustrate the potential for synergistic integration of modern computer vision methods into robotics by augmenting a RANSAC-based registration method with a state-of-the art semantic segmentation algorithm. We detail a convolutional architecture for semantic labeling of the scene, modified to operate efficiently using integral images. We combine this labeling with two novel scene parsing variants of RANSAC, and show, on a new RGB-D dataset that contains complex configurations of textureless and highly specular objects, that our method demonstrates improved performance of pose estimation over the unaugmented algorithms.
What problem does this paper attempt to address?