Natural Language Instruction Understanding for Robotic Manipulation: a Multisensory Perception Approach

Weihua Wang, Xiaofei Li, Yanzhi Dong, Jun Xie, Di Guo, Huaping Liu
DOI: https://doi.org/10.1109/icra48891.2023.10160906
2023-01-01
Abstract: Robots have long been expected to understand natural language instructions, enabling more natural human-robot interaction. Currently, a robot usually interprets an instruction by visually grounding the textual information in its surroundings, but visual perception alone may not be enough in some complex situations. It is therefore reasonable for the robot to leverage its multisensory perception ability to better understand the instruction. In this paper, we propose a multisensory perception approach to natural language instruction understanding for robotic manipulation, in which the robot coordinates its visual, tactile, and auditory perception to fully understand the instruction and then executes the manipulation task. Extensive experiments demonstrate the superiority of multisensory perception over single-sensory perception for instruction understanding. Moreover, we establish a user-friendly human-robot interaction interface through which a human sends instructions to the robot via a mobile app.