Deformable Object Manipulation Using Human Demonstration Enhanced Deep Deterministic Policy Gradient

Zihao Dong, Jian Huang, Haoyuan Wang, Bo Yang, Dongrui Wu, Yaonan Zhu, Yasuhisa Hasegawa
DOI: https://doi.org/10.1109/mhs59931.2023.10510085
2023-01-01
Abstract: Reinforcement learning (RL) has demonstrated potential in addressing deformable object manipulation problems. However, significant challenges remain regarding the complexity of these tasks and the limited efficiency of policy initialization with human demonstrations (HD), which may necessitate substantial data and time for convergence. This paper proposes human demonstrations enhanced deep deterministic policy gradients (HDE-DDPG) to simplify the deformable object manipulation problem and maximize the benefits of HD in RL. The complexity of manipulation tasks is reduced by carrying out conditional policy training and utilizing a fuzzy classification system known as high-dimensional Takagi–Sugeno–Kang (HTSK) for grasp point selection. The output of the HTSK is then fed as input to the actor to determine the placement position. By training the critic to assign higher scores to HD actions during pretraining, the training of the critic is accelerated. As a result of an improved critic, behavior cloning (BC) is used more frequently and appropriately, which fully exploits HD and accelerates the training of the entire system. HDE-DDPG was evaluated through ablation experiments, and the results demonstrate that our method significantly speeds up the agent's training while achieving superior and consistent performance.
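The abstract does not give the loss functions, so the following is only a minimal sketch of how the two demonstration-related ideas could fit together. It assumes a hinge-style margin loss for critic pretraining and a Q-filter that gates the behavior cloning term. The function names, the margin value, and both loss forms are hypothetical illustrations, not the paper's formulation.

```python
from typing import Sequence


def margin_pretrain_loss(q_demo: float, q_other: float, margin: float = 0.1) -> float:
    """Hypothetical critic pretraining penalty.

    Zero once the critic scores the demonstrated action at least `margin`
    above an alternative action in the same state; positive otherwise.
    """
    return max(0.0, q_other - q_demo + margin)


def q_filtered_bc_loss(
    demo_actions: Sequence[Sequence[float]],
    actor_actions: Sequence[Sequence[float]],
    q_demo: Sequence[float],
    q_actor: Sequence[float],
) -> float:
    """Hypothetical Q-filtered behavior cloning loss.

    Sums the squared error between actor and demo actions, but only for
    samples where the critic rates the demo action above the actor's own
    action, then divides by the batch size.
    """
    total = 0.0
    for a_d, a_p, qd, qp in zip(demo_actions, actor_actions, q_demo, q_actor):
        if qd > qp:  # imitate only when the critic prefers the demonstration
            total += sum((x - y) ** 2 for x, y in zip(a_d, a_p))
    return total / max(len(q_demo), 1)
```

Under these assumptions, a critic that already ranks demonstrated actions highly passes the filter more often, which is one plausible reading of the claim that BC is used "more frequently and appropriately" after pretraining.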