End-to-end Manipulator Calligraphy Planning via Variational Imitation Learning

Fangping Xie, Pierre Le Meur, Charith Fernando
2023-04-06
Abstract: Planning from demonstrations has shown promising results with the advances of deep neural networks. One of the most popular real-world applications is automated handwriting using a robotic manipulator. Classically, it is simplified to a two-dimensional problem. This representation is suitable for elementary drawings, but it is not sufficient for Japanese calligraphy or complex works of art, where the orientation of the pen is part of the user's expression. In this study, we focus on automated planning of Japanese calligraphy using a three-dimensional representation of the trajectory as well as the rotation of the pen tip, and propose a novel deep imitation learning neural network that learns from expert demonstrations through a combination of images and pose data. The network consists of a combination of a variational auto-encoder, a bi-directional LSTM, and a Multi-Layer Perceptron (MLP). Experiments are conducted in a progressive way, and results demonstrate that the proposed approach successfully completes tasks on real-world robots, overcoming the distribution shift problem in imitation learning. The source code and dataset will be made public.
Robotics, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the automatic planning of Japanese calligraphy with a robotic manipulator through variational imitation learning. The task requires planning the pen-tip trajectory in three-dimensional space together with the rotation of the pen tip. Earlier work on automated handwriting usually reduces the task to a two-dimensional problem, which may be sufficient for simple drawings or Western calligraphy, but is insufficient for Japanese calligraphy or complex artworks in which the orientation of the pen tip is part of the expression. To address this, the authors propose a new deep imitation learning neural network that learns from expert demonstrations and is trained on a combination of image and pose data. The architecture combines a Variational Autoencoder (VAE), a bi-directional LSTM, and a Multi-Layer Perceptron (MLP). This lets the model capture the visual state during writing and handle complex motion sequences, so that it imitates the style and technique of human calligraphers more accurately. The paper also explores how a Residual Connection Feature Pyramid Network can strengthen the model's representational ability, and how data augmentation can improve its generalization and robustness. Experimental results show that the proposed model successfully completes tasks on real robotic systems and overcomes the distribution shift problem in imitation learning.
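To make the "variational" part of the architecture more concrete, the following is a minimal sketch of two standard VAE ingredients (sampling a latent with the reparameterization trick, and the closed-form KL term against a standard normal prior), followed by a step that joins the image latent with a pen pose. It is not the authors' implementation: the function names, the two-dimensional latent, and the pose layout (x, y, z, roll, pitch, yaw) are assumptions chosen for illustration, and the summary does not state which loss or rotation representation the paper uses.

```python
import math
import random
from typing import List, Sequence


def reparameterize(mu: Sequence[float], log_var: Sequence[float],
                   rng: random.Random) -> List[float]:
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), where sigma = exp(log_var / 2)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu: Sequence[float], log_var: Sequence[float]) -> float:
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def build_step_features(z: Sequence[float], pose: Sequence[float]) -> List[float]:
    """Concatenate an image latent with a pen pose.

    Assumed (hypothetical) pose layout: (x, y, z, roll, pitch, yaw).
    """
    return list(z) + list(pose)


# Usage: encode one hypothetical frame and attach the current pen pose.
rng = random.Random(0)
mu = [0.5, -1.0]          # encoder mean (made-up values)
log_var = [0.0, 0.0]      # encoder log-variance (made-up values)
z = reparameterize(mu, log_var, rng)
features = build_step_features(z, (0.1, 0.2, 0.05, 0.0, 0.3, 0.0))
kl = kl_to_standard_normal(mu, log_var)
print(len(features), kl)
```

In a full model along the lines the summary describes, a sequence of such per-step feature vectors would be passed to the bi-directional LSTM, and the MLP would map its output to the next pen pose. Those stages are omitted here because the summary gives no details about their sizes or interfaces.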