World Model-based Perception for Visual Legged Locomotion

Hang Lai, Jiahang Cao, Jiafeng Xu, Hongtao Wu, Yunfeng Lin, Tao Kong, Yong Yu, Weinan Zhang

2024-09-25

Abstract: Legged locomotion over various terrains is challenging and requires precise perception of the robot and its surroundings from both proprioception and vision. However, learning directly from high-dimensional visual input is often data-inefficient and intricate. To address this issue, traditional methods attempt to learn a teacher policy with access to privileged information first and then learn a student policy to imitate the teacher's behavior with visual input. Despite some progress, this imitation framework prevents the student policy from achieving optimal performance due to the information gap between inputs. Furthermore, the learning process is unnatural since animals intuitively learn to traverse different terrains based on their understanding of the world without privileged knowledge. Inspired by this natural ability, we propose a simple yet effective method, World Model-based Perception (WMP), which builds a world model of the environment and learns a policy based on the world model. We illustrate that though completely trained in simulation, the world model can make accurate predictions of real-world trajectories, thus providing informative signals for the policy controller. Extensive simulated and real-world experiments demonstrate that WMP outperforms state-of-the-art baselines in traversability and robustness. Videos and code are available at: https://wmp-loco.github.io/

Robotics, Machine Learning

What problem does this paper attempt to address?

The paper addresses visual legged locomotion, in particular how to use visual information for precise perception and efficient learning. It proposes a framework called World Model-based Perception (WMP) to overcome limitations of existing methods.

Traditional methods typically adopt a privileged learning framework: a teacher policy with access to low-dimensional privileged information (such as scan points around the robot) is trained first, and a student policy that receives visual input is then trained to mimic the teacher's behavior. However, this setting has the following problems:

1. **Performance Gap**: Due to generalization errors, the student policy struggles to fully replicate the performance of the teacher policy.
2. **Design Complexity**: The teacher policy requires access to various types of additional information, which is challenging to provide in practical applications.
3. **Data Inefficiency**: Learning policies directly from high-dimensional pixel inputs is very data-intensive. In addition, with a forward-facing camera, the policy must remember past perceptions to infer the terrain currently under the robot's feet.

WMP instead builds a world model of the environment and learns the policy on top of that model. This avoids the limitations of privileged learning and naturally compresses a sequence of high-dimensional perceptions into a compact representation that supports decision-making. Experimental results show that WMP outperforms existing baseline methods on various terrains and successfully handles complex terrains in real-world environments.
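The idea of compressing a history of perceptions into a state that the policy consumes can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class names `TinyRecurrentModel` and `TinyPolicy`, the dimensions, the random untrained weights, and the plain tanh recurrence are hypothetical stand-ins, since the summary does not specify the paper's actual world-model architecture or training procedure.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyRecurrentModel:
    """Hypothetical stand-in for a world model's recurrent state update."""

    def __init__(self, obs_dim, act_dim, state_dim, seed=0):
        rng = random.Random(seed)
        self.state_dim = state_dim
        self.W = rand_matrix(state_dim, state_dim + obs_dim + act_dim, rng)

    def initial_state(self):
        return [0.0] * self.state_dim

    def step(self, state, obs, action):
        # The new state mixes the previous state, the current observation
        # features, and the previous action, so it summarizes the history.
        return [math.tanh(v) for v in matvec(self.W, state + obs + action)]


class TinyPolicy:
    """Hypothetical policy that reads proprioception plus the model state."""

    def __init__(self, proprio_dim, state_dim, act_dim, seed=1):
        rng = random.Random(seed)
        self.W = rand_matrix(act_dim, proprio_dim + state_dim, rng)

    def act(self, proprio, state):
        return [math.tanh(v) for v in matvec(self.W, proprio + state)]


def rollout(model, policy, observations, proprios, act_dim):
    """Run the model and policy over a sequence; return final state and actions."""
    state = model.initial_state()
    action = [0.0] * act_dim
    actions = []
    for obs, proprio in zip(observations, proprios):
        state = model.step(state, obs, action)
        action = policy.act(proprio, state)
        actions.append(action)
    return state, actions


if __name__ == "__main__":
    model = TinyRecurrentModel(obs_dim=4, act_dim=2, state_dim=8)
    policy = TinyPolicy(proprio_dim=3, state_dim=8, act_dim=2)
    obs_seq = [[0.5, -0.2, 0.1, 0.0]] * 5
    proprio_seq = [[0.1, 0.0, -0.1]] * 5
    final_state, actions = rollout(model, policy, obs_seq, proprio_seq, act_dim=2)
    print(len(final_state), len(actions))
```

Because the state is carried across steps, two observation histories that end in the same frame generally produce different states. This is the property a policy with a forward-facing camera needs in order to reason about terrain that has already left the field of view.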

Learning Gait-conditioned Bipedal Locomotion with Motor Adaptation

Learning Robust, Agile, Natural Legged Locomotion Skills in the Wild

Learning robust perceptive locomotion for quadrupedal robots in the wild

Learning Humanoid Locomotion with Perceptive Internal Model

Traversability-Aware Legged Navigation by Learning from Real-World Visual Data

CTS: Concurrent Teacher-Student Reinforcement Learning for Legged Locomotion

Walking with Terrain Reconstruction: Learning to Traverse Risky Sparse Footholds

Hybrid Internal Model: Learning Agile Legged Locomotion with Simulated Robot Response

Online Learning of Unknown Dynamics for Model-Based Controllers in Legged Locomotion

PA-LOCO: Learning Perturbation-Adaptive Locomotion for Quadruped Robots

Experience-Learning Inspired Two-Step Reward Method for Efficient Legged Locomotion Learning Towards Natural and Robust Gaits

Learning to enhance multi-legged robot on rugged landscapes

Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers

Learning to walk in confined spaces using 3D representation

Learning to See Physical Properties with Active Sensing Motor Policies

Learning Semantics-Aware Locomotion Skills from Human Demonstration

Hierarchical Vision Navigation System for Quadruped Robots with Foothold Adaptation Learning

Terrain-Aware Quadrupedal Locomotion via Reinforcement Learning

Learning World Transition Model for Socially Aware Robot Navigation

ZSL-RPPO: Zero-Shot Learning for Quadrupedal Locomotion in Challenging Terrains using Recurrent Proximal Policy Optimization