Abstract: Planning from demonstrations has shown promising results with the advances of deep neural networks. One of the most popular real-world applications is automated handwriting using a robotic manipulator. Classically, it is simplified to a two-dimensional problem. This representation is suitable for elementary drawings, but it is not sufficient for Japanese calligraphy or complex works of art, where the orientation of the pen is part of the user's expression. In this study, we focus on automated planning of Japanese calligraphy using a three-dimensional representation of the trajectory as well as the rotation of the pen tip, and propose a novel deep imitation learning neural network that learns from expert demonstrations through a combination of images and pose data. The network consists of a variational auto-encoder, a bi-directional LSTM, and a Multi-Layer Perceptron (MLP). Experiments are conducted in a progressive way, and the results demonstrate that the proposed approach successfully completes tasks on real-world robots, overcoming the distribution shift problem in imitation learning. The source code and dataset will be made public.

What problem does this paper attempt to address?

This paper addresses automatic planning of Japanese calligraphy through variational imitation learning. Specifically, the researchers focus on how a robotic manipulator can automate Japanese calligraphy, which requires both trajectory planning and rotation control of the pen tip in three-dimensional space. Research on automated handwriting usually simplifies the task to a two-dimensional problem. That is sufficient for simple drawings or Western calligraphy, but not for Japanese calligraphy or complex artworks, where the orientation of the pen tip is part of the expression. To address this challenge, the authors propose a new deep imitation learning neural network that learns from expert demonstrations and is trained on a combination of image and pose data. The architecture combines a Variational Autoencoder (VAE), a bi-directional LSTM, and a Multi-Layer Perceptron (MLP). In this way, the model can capture the visual state during writing and also handle complex motion sequences, allowing it to imitate the style and technique of human calligraphers more accurately. The paper also explores how a Residual Connection Feature Pyramid Network can enhance the model's representational ability, and how data augmentation can improve its generalization and robustness. Experimental results show that the proposed model completes tasks on real robotic systems and overcomes the distribution shift problem in imitation learning.
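As a rough illustration of the VAE component named above, the sketch below shows the standard reparameterization step and KL term for a diagonal Gaussian latent, plus one possible way to join an image latent with a pen pose before it is passed to a sequence model. It is not the authors' implementation: the function names, the latent size, and the six-value pose layout (x, y, z, roll, pitch, yaw) are assumptions made for illustration, since the summary only states that the pose includes a 3D position and the pen-tip rotation.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), one draw per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def build_policy_input(z, pose):
    """Concatenate an image latent with a pen pose.

    Assumed (hypothetical) pose layout: (x, y, z, roll, pitch, yaw).
    """
    if len(pose) != 6:
        raise ValueError("expected a 6-value pose: x, y, z, roll, pitch, yaw")
    return list(z) + list(pose)


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.2, -0.1, 0.0, 0.5], [-1.0, -1.0, -1.0, -1.0]
    z = reparameterize(mu, log_var, rng)
    features = build_policy_input(z, (0.10, 0.25, 0.02, 0.0, 0.3, 1.2))
    print(len(features), kl_to_standard_normal(mu, log_var))
```

In a full model of the kind described, a vector like `features` would be produced for each time step, and the resulting sequence would be fed to the bi-directional LSTM and MLP that predict the next pose; those parts are omitted here.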

End-to-end Manipulator Calligraphy Planning via Variational Imitation Learning
