Composable Instructions and Prospection Guided Visuomotor Control for Robotic Manipulation.

Quanquan Shao, Jie Hu, Weiming Wang, Yi Fang, Mingshuo Han, Jin Qi, Jin Ma
DOI: https://doi.org/10.2991/ijcis.d.191017.001
IF: 2.259
2019-01-01
International Journal of Computational Intelligence Systems
Abstract: Deep neural network-based end-to-end visuomotor control for robotic manipulation has recently become an active topic in robotics. In this framework, a one-hot vector is often used to select among tasks in multi-task settings. However, a one-hot vector is an inflexible way to describe multiple tasks and to convey human intentions. This paper proposes a framework that combines composable instructions with visuomotor control for multi-task problems. The framework consists mainly of two modules: a variational autoencoder (VAE) network and a long short-term memory (LSTM) network. The VAE encodes perception information about the environment into a small latent space. The LSTM module combines the embedded perception information with the composable instructions to guide robotic motion according to different intentions. Prospection is also used to learn the purposes of instructions: the network predicts not only the next action but also a sequence of future actions at the same time. To evaluate the framework, a series of experiments is conducted in pick-and-place application scenarios. On new tasks, the framework achieves a success rate of 91.2%, indicating good generalization ability.
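To make the contrast between one-hot task vectors and composable instructions concrete, the following is a minimal sketch and not the authors' implementation. The object and target vocabularies, the function names, and the padding rule at the end of a trajectory are all assumptions introduced for illustration; the abstract does not specify the actual instruction set or how prospection targets are built.

```python
# Hypothetical vocabularies: the abstract does not list the real instruction set.
OBJECTS = ["red_cube", "green_cube", "blue_cube"]
TARGETS = ["left_bin", "right_bin"]


def one_hot_task(obj, target):
    """One slot per (object, target) pair, so the size grows multiplicatively."""
    index = OBJECTS.index(obj) * len(TARGETS) + TARGETS.index(target)
    vec = [0.0] * (len(OBJECTS) * len(TARGETS))
    vec[index] = 1.0
    return vec


def composable_instruction(obj, target):
    """One one-hot block per instruction component, so the size grows additively."""
    obj_part = [1.0 if o == obj else 0.0 for o in OBJECTS]
    target_part = [1.0 if t == target else 0.0 for t in TARGETS]
    return obj_part + target_part


def prospection_targets(actions, t, horizon):
    """Supervision targets at step t: the action at t plus the following ones.

    Assumption: near the end of a trajectory the last action is repeated
    so that every step has exactly `horizon` targets.
    """
    last = len(actions) - 1
    return [actions[min(t + k, last)] for k in range(horizon)]


if __name__ == "__main__":
    print(len(one_hot_task("green_cube", "right_bin")))       # 3 * 2 slots
    print(composable_instruction("green_cube", "right_bin"))  # 3 + 2 slots
    demo_actions = ["reach", "grasp", "lift", "move", "release"]
    print(prospection_targets(demo_actions, t=3, horizon=3))
```

With this encoding, a pairing of object and target that never appeared together in training is still expressed with components the network has already seen, which is one plausible reading of why composable instructions could help on new tasks. In the framework described above, such an instruction vector would be fed to the LSTM together with the VAE latent code at each time step.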