WoCoCo: Learning Whole-Body Humanoid Control with Sequential Contacts

Chong Zhang, Wenli Xiao, Tairan He, Guanya Shi
2024-06-10
Abstract: Humanoid activities involving sequential contacts are crucial for complex robotic interactions and operations in the real world and are traditionally solved by model-based motion planning, which is time-consuming and often relies on simplified dynamics models. Although model-free reinforcement learning (RL) has become a powerful tool for versatile and robust whole-body humanoid control, it still requires tedious task-specific tuning and state machine design and suffers from long-horizon exploration issues in tasks involving contact sequences. In this work, we propose WoCoCo (Whole-Body Control with Sequential Contacts), a unified framework to learn whole-body humanoid control with sequential contacts by naturally decomposing the tasks into separate contact stages. Such decomposition facilitates simple and general policy learning pipelines through task-agnostic reward and sim-to-real designs, requiring only one or two task-related terms to be specified for each task. We demonstrated that end-to-end RL-based controllers trained with WoCoCo enable four challenging whole-body humanoid tasks involving diverse contact sequences in the real world without any motion priors: 1) versatile parkour jumping, 2) box loco-manipulation, 3) dynamic clap-and-tap dancing, and 4) cliffside climbing. We further show that WoCoCo is a general framework beyond humanoid by applying it in 22-DoF dinosaur robot loco-manipulation tasks.
Robotics, Graphics, Systems and Control
What problem does this paper attempt to address?
### The Problems This Paper Attempts to Solve

This paper primarily aims to address the following issues:

1. **Humanoid Robot Control with Sequential Contacts**: Humanoid tasks involving contact sequences (such as jumping, box manipulation, and dancing) are traditionally solved through model-based motion planning, which is time-consuming and often relies on simplified dynamics models. Although model-free reinforcement learning (RL) has shown strong potential for whole-body control, it still requires tedious task-specific tuning and state machine design, and it struggles with long-horizon exploration in tasks involving contact sequences.
2. **Simplifying Exploration**: The paper proposes a framework called WoCoCo (Whole-Body Control with Sequential Contacts), which decomposes each task into separate contact stages. This decomposition allows a simple and general policy learning pipeline built on task-agnostic reward and sim-to-real designs, with only one or two task-related terms specified per task.
3. **Generality and Flexibility**: The framework is not limited to humanoids. The authors demonstrate four real-world humanoid tasks without any motion priors (parkour jumping, box loco-manipulation, clap-and-tap dancing, and cliffside climbing) and also apply the framework to loco-manipulation tasks on a 22-DoF dinosaur robot.

In summary, the paper addresses humanoid control with sequential contacts by using the WoCoCo framework to train end-to-end RL controllers with little task-specific engineering.
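
The contact-stage decomposition described above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `ContactStage`, `contact_match_fraction`, `advance_stage`, and `PARKOUR_JUMP`, the end-effector labels, and the matching rule are all hypothetical and chosen only to show how a task might be expressed as an ordered list of contact goals that a learner works through one stage at a time.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ContactStage:
    """One stage of a task: a desired contact state per end effector (hypothetical)."""
    name: str
    contact_goal: Dict[str, bool]  # True = should be in contact; assumed non-empty


def contact_match_fraction(stage: ContactStage, measured: Dict[str, bool]) -> float:
    """Fraction of end effectors whose measured contact matches the stage goal.

    Illustrative signal only; the paper's actual reward terms are not given here.
    End effectors missing from `measured` are treated as not in contact.
    """
    goal = stage.contact_goal
    matches = sum(measured.get(body, False) == wanted for body, wanted in goal.items())
    return matches / len(goal)


def advance_stage(stages: List[ContactStage], index: int, measured: Dict[str, bool]) -> int:
    """Return the next stage index once the current contact goal is fully met.

    An index equal to len(stages) means the whole sequence is complete.
    """
    if index < len(stages) and contact_match_fraction(stages[index], measured) == 1.0:
        return index + 1
    return index


# Hypothetical three-stage jump: stance, flight, landing.
PARKOUR_JUMP = [
    ContactStage("stance", {"left_foot": True, "right_foot": True}),
    ContactStage("flight", {"left_foot": False, "right_foot": False}),
    ContactStage("landing", {"left_foot": True, "right_foot": True}),
]
```

In this sketch the stage index only moves forward when every listed end effector matches its goal, so a policy would be guided toward one contact goal at a time rather than having to discover the full sequence at once. A partial match leaves the index unchanged while `contact_match_fraction` still provides a graded signal.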