VQ-ACE: Efficient Policy Search for Dexterous Robotic Manipulation via Action Chunking Embedding

Chenyu Yang, Davide Liconti, Robert K. Katzschmann
2024-11-06
Abstract: Dexterous robotic manipulation remains a significant challenge due to the high dimensionality and complexity of hand movements required for tasks like in-hand manipulation and object grasping. This paper addresses this issue by introducing Vector Quantized Action Chunking Embedding (VQ-ACE), a novel framework that compresses human hand motion into a quantized latent space, significantly reducing the action space's dimensionality while preserving key motion characteristics. By integrating VQ-ACE with both Model Predictive Control (MPC) and Reinforcement Learning (RL), we enable more efficient exploration and policy learning in dexterous manipulation tasks using a biomimetic robotic hand. Our results show that latent space sampling with MPC produces more human-like behavior in tasks such as Ball Rolling and Object Picking, leading to higher task success rates and reduced control costs. For RL, action chunking accelerates learning and improves exploration, demonstrated through faster convergence in tasks like cube stacking and in-hand cube reorientation. These findings suggest that VQ-ACE offers a scalable and effective solution for robotic manipulation tasks involving complex, high-dimensional state spaces, contributing to more natural and adaptable robotic systems.
Robotics
What problem does this paper attempt to address?
This paper addresses the high dimensionality and complexity of dexterous robotic manipulation, in particular the hand movements required for tasks such as in-hand manipulation and object grasping. Specifically:

1. **High-dimensional action space**: The human hand has 27 degrees of freedom (DoF) and can perform a wide variety of complex movements and postures, which makes imitating its fine manipulation a major challenge in robotics.
2. **Complex hand movements**: Tasks such as ball rolling and object picking require precise control of these high-dimensional action sequences, which places high demands on existing robotic systems.

To address these problems, the paper introduces a framework named **Vector Quantized Action Chunking Embedding (VQ-ACE)**. VQ-ACE compresses human hand motion into a quantized latent space, significantly reducing the dimensionality of the action space while retaining key motion characteristics. This enables more efficient policy search and learning for dexterous manipulation with a biomimetic robotic hand.

### Main contributions

1. **VQ-ACE framework**: Embeds human hand action sequences into quantized latent representations.
2. **Latent-sampling model predictive control (MPC)**: A real-time action synthesis algorithm that samples in the latent space.
3. **Action-chunked reinforcement learning (RL)**: Uses action priors to improve RL exploration and accelerate learning.

With these methods, VQ-ACE achieves higher task success rates and lower control costs in the MPC experiments on ball rolling and object picking. For RL, it accelerates convergence and improves exploration efficiency on cube stacking and in-hand cube reorientation.

In conclusion, VQ-ACE offers a scalable and effective approach to robotic manipulation tasks involving complex, high-dimensional state spaces, contributing to more natural and adaptable robotic systems.
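
The summary names two mechanisms: vector quantization of action chunks, and planning by sampling in the resulting latent space. The following toy sketch illustrates both ideas and is not the authors' implementation. The dimensions, the random codebook, the linear `decode` stand-in, the `rollout_cost` function, and the uniform sampling in `plan` are all assumptions made for illustration; the paper's learned encoder and decoder, its sampling scheme, and its task costs are not described in this summary.

```python
import random

HORIZON = 4      # steps per action chunk (assumed)
ACTION_DIM = 3   # joints in the toy hand (assumed; a real hand has far more)
LATENT_DIM = 2   # size of each codebook vector (assumed)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(latent, codebook):
    """Vector quantization: snap a continuous latent to its nearest codebook entry."""
    index = min(range(len(codebook)),
                key=lambda i: squared_distance(latent, codebook[i]))
    return index, codebook[index]


def decode(code, weights):
    """Hypothetical linear decoder: map one code to a HORIZON x ACTION_DIM chunk."""
    flat = [sum(w * c for w, c in zip(row, code)) for row in weights]
    return [flat[t * ACTION_DIM:(t + 1) * ACTION_DIM] for t in range(HORIZON)]


def rollout_cost(joints, chunk, target):
    """Toy cost: apply each action as a joint-angle delta, then measure distance to target."""
    state = list(joints)
    for action in chunk:
        state = [s + a for s, a in zip(state, action)]
    return squared_distance(state, target)


def plan(joints, target, codebook, weights, num_samples, rng):
    """Random-shooting planner: sample codes, decode each to a chunk, keep the cheapest."""
    best = None
    for _ in range(num_samples):
        index = rng.randrange(len(codebook))
        chunk = decode(codebook[index], weights)
        cost = rollout_cost(joints, chunk, target)
        if best is None or cost < best[0]:
            best = (cost, index, chunk)
    return best


# Random stand-ins for a learned codebook and decoder (assumptions, not trained values).
rng = random.Random(0)
codebook = [[rng.uniform(-1, 1) for _ in range(LATENT_DIM)] for _ in range(16)]
weights = [[rng.uniform(-0.2, 0.2) for _ in range(LATENT_DIM)]
           for _ in range(HORIZON * ACTION_DIM)]

index, code = quantize([0.3, -0.1], codebook)
cost, best_index, chunk = plan([0.0] * ACTION_DIM, [0.5, 0.2, -0.1],
                               codebook, weights, num_samples=32, rng=rng)
```

The planner searches over 16 discrete codes rather than over all `HORIZON * ACTION_DIM` continuous action values, which is the sense in which a quantized latent space shrinks the search problem. In a learned system the codebook would come from human hand motion data, so each decoded chunk would resemble a plausible hand movement rather than random noise.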