Abstract: Humanoid robots that can autonomously operate in diverse environments have the potential to help address labour shortages in factories, assist the elderly at home, and colonize new planets. While classical controllers for humanoid robots have shown impressive results in a number of settings, they are challenging to generalize and adapt to new environments. Here, we present a fully learning-based approach for real-world humanoid locomotion. Our controller is a causal transformer that takes the history of proprioceptive observations and actions as input and predicts the next action. We hypothesize that the observation-action history contains useful information about the world that a powerful transformer model can use to adapt its behavior in-context, without updating its weights. We train our model with large-scale model-free reinforcement learning on an ensemble of randomized environments in simulation and deploy it to the real world zero-shot. Our controller can walk over various outdoor terrains, is robust to external disturbances, and can adapt in context.
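To make the controller description concrete, here is a minimal sketch of a history-conditioned policy with a single causal self-attention layer. It is not the authors' architecture: the token layout (each observation concatenated with the previous action), the layer sizes, the omission of positional embeddings, and the names `HistoryPolicy` and `causal_self_attention` are all assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np


def causal_self_attention(tokens, w_q, w_k, w_v):
    """Single-head self-attention in which step t attends only to steps <= t."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Causal mask: entries above the diagonal correspond to future steps.
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Numerically stable softmax over the allowed (past and present) steps.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


class HistoryPolicy:
    """Hypothetical policy: maps a window of (observation, action) history
    to the next action. Weights are random placeholders, not trained."""

    def __init__(self, obs_dim, act_dim, context_len=16, d_model=32, seed=0):
        rng = np.random.default_rng(seed)

        def init(*shape):
            return rng.normal(0.0, 0.1, size=shape)

        self.context_len = context_len
        self.w_embed = init(obs_dim + act_dim, d_model)
        self.w_q = init(d_model, d_model)
        self.w_k = init(d_model, d_model)
        self.w_v = init(d_model, d_model)
        self.w_out = init(d_model, act_dim)
        self.history = []                      # most recent tokens, oldest first
        self.prev_action = np.zeros(act_dim)   # no action has been taken yet

    def act(self, obs):
        # One token per time step: current observation plus previous action.
        self.history.append(np.concatenate([obs, self.prev_action]))
        self.history = self.history[-self.context_len:]
        tokens = np.stack(self.history) @ self.w_embed
        hidden = tokens + causal_self_attention(
            tokens, self.w_q, self.w_k, self.w_v
        )
        # The representation of the latest step predicts the next action.
        action = np.tanh(hidden[-1] @ self.w_out)
        self.prev_action = action
        return action


policy = HistoryPolicy(obs_dim=4, act_dim=2, context_len=3)
for step in range(5):
    action = policy.act(np.full(4, float(step)))
```

The weights never change inside `act`; only the contents of `history` do. That is the sense in which a controller of this kind could adapt in context: the same parameters produce different actions depending on what the recent observations and actions reveal about the environment.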

What problem does this paper attempt to address?

The paper addresses real-world locomotion control for full-sized humanoid robots. Specifically, the research team proposes a fully learning-based approach in which a causal transformer serves as the controller, predicting the next action from the history of past proprioceptive observations and actions. The main objectives of this method include:

1. **Addressing the limitations of traditional controllers**: Traditional humanoid robot controllers perform well in certain scenarios but are difficult to adapt to new environments. The researchers therefore aim to improve the robot's adaptability and generalization through a learning-based controller.
2. **Achieving stable walking in diverse environments**: Trained with large-scale model-free reinforcement learning, the robot can walk stably in various outdoor environments, including plazas, sidewalks, tracks, and grass.
3. **Adapting to external disturbances and different loads**: The robot needs to maintain balance when subjected to external forces and to walk while carrying loads of different weights and shapes.
4. **No additional sensors required**: The controller relies solely on the history of past observations and actions, without additional sensors such as cameras.
5. **Zero-shot deployment**: After extensive training in simulation, the controller is deployed directly in the real world without further adjustment or fine-tuning.

Additionally, the research demonstrates some interesting characteristics of the controller, such as natural walking behavior, coordinated arm swinging, and adaptive behavior changes on different terrains. These behaviors emerge through the training process and are not pre-programmed.

Overall, the study shows that a simple and general learning-based controller can achieve complex, high-dimensional humanoid robot control in the physical world.
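The abstract describes training on an ensemble of randomized simulated environments before zero-shot deployment. The sketch below illustrates that general idea by sampling a different set of physical parameters for each simulated environment. The parameter names, the ranges in `RANGES`, and the helpers `sample_env_params` and `make_ensemble` are hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvParams:
    """Hypothetical physical parameters for one simulated environment."""
    ground_friction: float
    payload_mass_kg: float
    motor_strength_scale: float
    push_force_n: float


# Illustrative ranges only; the actual randomization is not given in this text.
RANGES = {
    "ground_friction": (0.3, 1.2),
    "payload_mass_kg": (0.0, 5.0),
    "motor_strength_scale": (0.8, 1.2),
    "push_force_n": (0.0, 100.0),
}


def sample_env_params(rng):
    """Draw each parameter uniformly from its range."""
    return EnvParams(**{name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()})


def make_ensemble(num_envs, seed=0):
    """Build a reproducible ensemble of randomized environments."""
    rng = random.Random(seed)
    return [sample_env_params(rng) for _ in range(num_envs)]


ensemble = make_ensemble(num_envs=8, seed=0)
for params in ensemble[:3]:
    print(params)
```

A policy trained across such an ensemble never sees one fixed set of dynamics, so it is pushed to infer the current conditions from its observation-action history. This is one plausible reading of why randomized training supports deployment without fine-tuning.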

Real-World Humanoid Locomotion with Reinforcement Learning

Learning Humanoid Locomotion over Challenging Terrain

Humanoid Locomotion as Next Token Prediction

Hierarchical World Models as Visual Whole-Body Humanoid Controllers

Learning Bipedal Walking for Humanoids with Current Feedback

Advancing Humanoid Locomotion: Mastering Challenging Terrains with Denoising World Model Learning

Whole-body Humanoid Robot Locomotion with Human Reference

Lifelike Agility and Play in Quadrupedal Robots using Reinforcement Learning and Generative Pre-trained Models

Learning Generic and Dynamic Locomotion of Humanoids Across Discrete Terrains

Real-World Human-Robot Collaborative Reinforcement Learning

Berkeley Humanoid: A Research Platform for Learning-based Control

Words into Action: Learning Diverse Humanoid Robot Behaviors using Language Guided Iterative Motion Refinement

Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation

Dexterous Legged Locomotion in Confined 3D Spaces with Reinforcement Learning

Learning Bipedal Walking On Planned Footsteps For Humanoid Robots

Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control

Achieving Stable High-Speed Locomotion for Humanoid Robots with Deep Reinforcement Learning

iWalker: Imperative Visual Planning for Walking Humanoid Robot

Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers