Opt2Skill: Imitating Dynamically-feasible Whole-Body Trajectories for Versatile Humanoid Loco-Manipulation

Fukang Liu, Zhaoyuan Gu, Yilin Cai, Ziyi Zhou, Shijie Zhao, Hyunyoung Jung, Sehoon Ha, Yue Chen, Danfei Xu, Ye Zhao

2024-10-29

Abstract: Humanoid robots are designed to perform diverse loco-manipulation tasks. However, they face challenges due to their high-dimensional and unstable dynamics, as well as the complex contact-rich nature of the tasks. Model-based optimal control methods offer precise and systematic control but are limited by high computational complexity and the need for accurate contact sensing. On the other hand, reinforcement learning (RL) provides robustness and handles high-dimensional spaces but suffers from inefficient learning, unnatural motion, and sim-to-real gaps. To address these challenges, we introduce Opt2Skill, an end-to-end pipeline that combines model-based trajectory optimization with RL to achieve robust whole-body loco-manipulation. We generate reference motions for the Digit humanoid robot using differential dynamic programming (DDP) and train RL policies to track these trajectories. Our results demonstrate that Opt2Skill outperforms pure RL methods in both training efficiency and task performance, and that optimal trajectories accounting for torque limits improve trajectory tracking. We successfully transfer our approach to real-world applications.

Robotics

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper aims to address the challenges humanoid robots face when performing diverse loco-manipulation tasks. Specifically, it focuses on the following points:

1. **High-dimensional and unstable dynamics**: Humanoid robots have high-dimensional, unstable dynamics, which makes precise and stable motion control difficult.
2. **Complex contact-rich tasks**: Humanoid robots need to handle various contact-rich tasks, such as carrying heavy objects, climbing stairs, and performing agile skills (e.g., jumping). The complexity of these tasks further increases the difficulty of control.
3. **Limitations of model-based optimal control**: Model-based optimal control methods provide precise and systematic control, but they have high computational complexity and require accurate contact sensing.
4. **Limitations of reinforcement learning**: Reinforcement learning (RL) methods handle high-dimensional spaces and provide robustness, but they suffer from inefficient learning, unnatural motion, and the sim-to-real gap.

To address these challenges, the paper introduces **Opt2Skill**, an end-to-end pipeline that combines model-based trajectory optimization with reinforcement learning to achieve robust whole-body loco-manipulation for humanoid robots. Reference motions are generated with differential dynamic programming (DDP), and RL policies are trained to track them. Opt2Skill outperforms pure RL methods in training efficiency and task performance and has been successfully transferred to the real robot.
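The idea of training an RL policy to track an optimized reference trajectory can be illustrated with a small reward function. The sketch below is not taken from the paper: the function name `tracking_reward`, the exponential form, the weights, and the scale parameters are all assumptions chosen for illustration. It rewards the policy for staying close to reference joint positions and reference joint torques, such as those a trajectory optimizer might provide.

```python
import math


def tracking_reward(q, q_ref, tau, tau_ref,
                    sigma_q=0.5, sigma_tau=20.0, w_q=0.7, w_tau=0.3):
    """Hypothetical imitation reward (illustrative, not the paper's definition).

    q, q_ref     : measured and reference joint positions (same length)
    tau, tau_ref : applied and reference joint torques (same length)
    Returns a value in (0, 1]; 1 means the reference is tracked exactly.
    """
    if len(q) != len(q_ref) or len(tau) != len(tau_ref):
        raise ValueError("state and reference must have matching lengths")

    # Squared tracking errors summed over all joints.
    q_err = sum((a - b) ** 2 for a, b in zip(q, q_ref))
    tau_err = sum((a - b) ** 2 for a, b in zip(tau, tau_ref))

    # Exponential kernels map each error to (0, 1]; sigma sets the tolerance.
    r_q = math.exp(-q_err / sigma_q ** 2)
    r_tau = math.exp(-tau_err / sigma_tau ** 2)

    # Weighted sum of the position and torque tracking terms.
    return w_q * r_q + w_tau * r_tau


# Example: a policy that tracks the reference closely earns a higher reward.
q_ref, tau_ref = [0.0, 0.5, -0.3], [10.0, -5.0, 2.0]
close = tracking_reward([0.01, 0.49, -0.31], q_ref, [10.5, -5.2, 2.1], tau_ref)
far = tracking_reward([0.4, 0.1, 0.2], q_ref, [30.0, 10.0, -8.0], tau_ref)
print(close > far)
```

In a full training loop, a term like this would typically be combined with task and regularization rewards. Including a torque term is only meaningful when the reference is dynamically feasible, which is one reason a trajectory optimizer that respects torque limits can be a useful source of references.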

Human Demonstration Trajectory Refinement for Redundant Manipulators.

Trajectory Planning of 7-DOF Humanoid Manipulator under Rapid and Continuous Reaction and Obstacle Avoidance Environment

A Combined Learning and Optimization Framework to Transfer Human Whole-body Loco-manipulation Skills to Mobile Manipulators

Human-Robot Skill Transfer with Enhanced Compliance via Dynamic Movement Primitives

Whole-Body Inverse Kinematics and Operation-Oriented Motion Planning for Robot Mobile Manipulation

Learning Whole-Body Loco-Manipulation for Omni-Directional Task Space Pose Tracking with a Wheeled-Quadrupedal-Manipulator

Deep Reinforcement Learning Based Co-Optimization of Morphology and Gait for Small-Scale Legged Robot

Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning

Deep Imitation Learning for Humanoid Loco-manipulation through Human Teleoperation

HYPERmotion: Learning Hybrid Behavior Planning for Autonomous Loco-manipulation

RL + Model-based Control: Using On-demand Optimal Control to Learn Versatile Legged Locomotion

Learning Visual Quadrupedal Loco-Manipulation from Demonstrations

Mobile-TeleVision: Predictive Motion Priors for Humanoid Whole-Body Control

Wheeled Humanoid Bilateral Teleoperation with Position-Force Control Modes for Dynamic Loco-Manipulation

Modeling and reinforcement learning-based locomotion control for a humanoid robot with kinematic loop closures

A Multi-Stage Approach for Efficiently Learning Humanoid Robot Stand-Up Behavior

Versatile Multi-Contact Planning and Control for Legged Loco-Manipulation

Achieving Stable High-Speed Locomotion for Humanoid Robots with Deep Reinforcement Learning

Dynamic Bipedal Loco-manipulation using Oracle Guided Multi-mode Policies with Mode-transition Preference