An Object Attribute Guided Framework for Robot Learning Manipulations from Human Demonstration Videos

Qixiang Zhang, Junhong Chen, Dayong Liang, Huaping Liu, Xiaojing Zhou, Zihan Ye, Wenyin Liu
DOI: https://doi.org/10.1109/iros40897.2019.8967621
2019-01-01
Abstract: Learning manipulations from videos is an inspiring way for robots to acquire new skills. In this paper, we propose a framework that generates robotic manipulation plans by observing human demonstration videos, without requiring special markers or unnatural demonstration behaviors. The framework contains a video parsing module and a robot execution module. The first module recognizes the demonstrator's actions using two-stream convolutional neural networks and classifies the manipulated objects using Mask R-CNN. Two XGBoost classifiers are then applied to identify the subject object and the patient object, respectively, according to the demonstrator's actions. In the second module, a grammar-based parser summarizes the videos and generates common instructions for robot execution. Extensive experiments on a publicly available video dataset of 273 videos show that our approach learns manipulation plans from demonstration videos with high accuracy (73.36%). Furthermore, we integrate our framework with the humanoid robot Baxter to perform manipulation learning from demonstration videos, which verifies the performance of our framework.
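The data flow described in the abstract (recognized action, detected objects, subject/patient selection, instruction) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Detection` type, the `pick_roles` and `to_instruction` helpers, the scorer callables, and the toy feature values are all hypothetical, and the scorers merely stand in for the two trained XGBoost classifiers. It also assumes that the subject object is the one performing the action and the patient object is the one acted upon, and that an instruction can be written as an (action, subject, patient) triple.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Detection:
    """One detected object (hypothetical stand-in for a detector's output)."""
    label: str
    features: Tuple[float, ...]  # hypothetical per-object attribute vector


# A scorer maps (action, detection) to a confidence; here it stands in for
# a trained classifier. This interface is an assumption, not the paper's API.
Scorer = Callable[[str, Detection], float]


def pick_roles(
    action: str,
    detections: List[Detection],
    subject_score: Scorer,
    patient_score: Scorer,
) -> Tuple[Detection, Detection]:
    """Pick the most likely subject, then the most likely patient among the rest."""
    if len(detections) < 2:
        raise ValueError("need at least two detected objects")
    subject = max(detections, key=lambda d: subject_score(action, d))
    remaining = [d for d in detections if d is not subject]
    patient = max(remaining, key=lambda d: patient_score(action, d))
    return subject, patient


def to_instruction(action: str, subject: Detection, patient: Detection) -> Tuple[str, str, str]:
    """Format the result as an (action, subject, patient) triple (assumed format)."""
    return (action, subject.label, patient.label)


# Toy scorers: read fixed positions of the hypothetical feature vector.
def toy_subject_score(action: str, d: Detection) -> float:
    return d.features[0]


def toy_patient_score(action: str, d: Detection) -> float:
    return d.features[1]


detections = [
    Detection("knife", (0.9, 0.1)),
    Detection("bread", (0.2, 0.8)),
    Detection("bowl", (0.1, 0.3)),
]
subject, patient = pick_roles("cut", detections, toy_subject_score, toy_patient_score)
instruction = to_instruction("cut", subject, patient)
print(instruction)
```

In a real system the scorers would wrap trained models and the action label would come from the action recognition network; the sketch only shows how the pieces could fit together.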