Learning visual object models on a robot using context and appearance cues

Xiang Li, Mohan Sridharan, Catie Meador
DOI: https://doi.org/10.5555/2484920.2485124
2013-01-01
Abstract: Visual object recognition is a key challenge to the deployment of robots in domains characterized by partial observability and unforeseen changes. Sophisticated algorithms developed for modeling and recognizing objects using different visual cues [Mikolajczyk:IJCV04,Porway:PAMI11] are computationally expensive, sensitive to changes in object configurations and environmental factors, and require many training samples and accurate domain knowledge to learn object models, making it difficult for robots to reliably and efficiently model and recognize objects. These challenges are partially offset by the fact that many objects possess unique characteristics (e.g., color and shape) and motion patterns, although these characteristics and patterns are not known in advance and may change over time. Furthermore, only a subset of domain objects is relevant to any given task, and a variety of cues can be extracted from images to represent objects. This paper presents an algorithm that enables robots to identify a set of interesting objects, using appearance-based and contextual cues extracted from a small number of images to efficiently learn models of these objects. Robots learn the domain map and consider objects that move to be interesting, using motion cues to identify the corresponding image regions. Object models learned automatically from these regions consist of the spatial arrangement of gradient features, graph-based models of neighborhoods of gradient features, parts-based models of image segments, color distributions, and mixture models of local context. The learned models are used for object recognition in novel scenes based on energy minimization and a generative model for information fusion. All algorithms are evaluated on wheeled robots in indoor and outdoor domains.