Abstract: Accurate state estimation plays a critical role in ensuring the robust control of humanoid robots, particularly in the context of learning-based control policies for legged robots. However, there is a notable gap in analytical research concerning estimations. Therefore, we endeavor to further understand how various types of estimations influence the decision-making processes of policies. In this paper, we provide quantitative insight into the effectiveness of learned state estimations, employing saliency analysis to identify key estimation variables and optimize their combination for humanoid locomotion tasks. Evaluations assessing tracking precision and robustness are conducted on comparative groups of policies with varying estimation combinations in both simulated and real-world environments. Results validated that the proposed policy is capable of crossing the sim-to-real gap and demonstrating superior performance relative to alternative policy configurations.

What problem does this paper attempt to address?

The paper primarily aims to address the following issues:

1. **Understanding the role of key state estimation in reinforcement learning control policies**: The researchers examine how different types of estimations influence a policy's decision-making process, and they evaluate the effectiveness of learned state estimations through quantitative analysis.
2. **Determining the optimal combination of estimation variables**: Saliency analysis identifies the key estimation variables, and their combination is then optimized to improve performance on humanoid walking tasks.
3. **Designing a highly adaptable learning framework**: The paper proposes a controllable and highly adaptable framework, based on an asymmetric actor-critic structure, for learning humanoid walking.

Specifically, the researchers focus on the following aspects:

- **Methodology**: Policies are trained with an asymmetric actor-critic structure. The actor can access only 0.5 seconds of historical observation data (including delayed and noisy proprioceptive information and commands), while the critic can access all types of system states.
- **State and action definitions**: States are divided into observations, privileged information, and commands, and the paper defines the specific content of each type.
- **Reward design**: A bell-shaped kernel function is introduced into the reward design to encourage policies to survive in complex environments.
- **Saliency analysis**: The integrated gradients method from explainable artificial intelligence is used to quantify the importance of different estimation variables.
- **Experimental setup and results**: The effectiveness of different estimation configurations is evaluated through a series of simulations and real-world experiments, which verify that policies containing the most relevant estimation variables achieve the best overall performance.

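
As a rough illustration of the saliency step mentioned above, the following is a minimal sketch of integrated gradients on a toy function. The tanh model, its weights, the all-zero baseline, and the three input names are assumptions made purely for illustration; they are not the paper's policy network or its actual estimation variables.

```python
import math


def policy_output(x, w):
    """Toy stand-in for one policy output: tanh of a weighted sum of inputs."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))


def policy_gradient(x, w):
    """Analytic gradient of policy_output with respect to each input."""
    y = policy_output(x, w)
    return [(1.0 - y * y) * wi for wi in w]


def integrated_gradients(x, baseline, w, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum.

    Gradients are averaged along the straight path from baseline to x,
    then scaled by (x - baseline) per input dimension.
    """
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = policy_gradient(point, w)
        total = [t + g for t, g in zip(total, grad)]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


# Hypothetical estimation inputs and weights (not taken from the paper).
names = ["base_velocity", "foot_height", "contact_prob"]
w = [1.2, -0.3, 0.05]
x = [0.8, 0.5, 1.0]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, w)

# Rank inputs by absolute attribution, largest first.
ranking = sorted(zip(names, attributions), key=lambda p: abs(p[1]), reverse=True)
```

The attributions sum approximately to the difference between the output at the input and at the baseline. This completeness property is what makes the per-input scores comparable when ranking variables by importance.
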
The main contributions of the paper include:

- Conducting a quantitative analysis of how estimation variables affect the performance of learned policies and proposing the optimal combination of estimation variables.
- Proposing a controllable and highly adaptable learning framework for humanoid walking, based on an asymmetric actor-critic structure.
- Testing the proposed framework and estimation methods in the real world and demonstrating their adaptability to outdoor environments.

In summary, this paper aims to provide a theoretical basis and technical support for enhancing the stability and adaptability of humanoid robots in complex environments through in-depth analysis and experimental evaluation.
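
The asymmetric actor-critic split described in this summary can be sketched as follows: the actor receives a short history of noisy observations and commands, while the critic receives the full state. This is a sketch under stated assumptions, not the authors' implementation. The 50 Hz control rate, the noise scale, the vector sizes, and all class and function names are hypothetical, and observation delay is omitted for brevity; only the 0.5 second history window comes from the summary.

```python
from collections import deque
import random

CONTROL_HZ = 50          # assumption: the summary does not state the control rate
HISTORY_SECONDS = 0.5    # from the summary: the actor sees 0.5 s of history
HISTORY_LEN = int(CONTROL_HZ * HISTORY_SECONDS)


class ActorHistory:
    """Fixed-length buffer of noisy observation frames (hypothetical helper)."""

    def __init__(self, frame_dim, length=HISTORY_LEN):
        # Start with zero frames so the actor input has a constant size.
        self.buffer = deque(
            [[0.0] * frame_dim for _ in range(length)], maxlen=length
        )

    def push(self, frame):
        # The oldest frame is dropped automatically once maxlen is reached.
        self.buffer.append(list(frame))

    def flat(self):
        return [v for frame in self.buffer for v in frame]


def add_noise(values, scale, rng):
    """Add zero-mean Gaussian noise to each value."""
    return [v + rng.gauss(0.0, scale) for v in values]


def build_inputs(history, proprio, command, privileged, rng, noise_scale=0.01):
    """Return (actor_input, critic_input) for one control step."""
    # Actor: noisy proprioception plus commands, stacked over the history window.
    history.push(add_noise(proprio, noise_scale, rng) + list(command))
    actor_input = history.flat()
    # Critic: clean observations, privileged information, and commands.
    critic_input = list(proprio) + list(privileged) + list(command)
    return actor_input, critic_input


rng = random.Random(0)
history = ActorHistory(frame_dim=3 + 2)
actor_in, critic_in = build_inputs(
    history, proprio=[0.1, 0.2, 0.3], command=[1.0, 0.0], privileged=[0.5], rng=rng
)
```

With these illustrative sizes, the actor input stacks 25 frames of 5 values each, while the critic input is a single 6-value vector that includes the privileged entry the actor never sees.
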

Toward Understanding Key Estimation in Learning Robust Humanoid Locomotion

Invariant EKF based State Estimator for Quadruped Robots

Learning Accurate and Robust Velocity Tracking for Quadrupedal Robots

Learning Gait-conditioned Bipedal Locomotion with Motor Adaptation

Adaptive Robust Invariant Extended Kalman filtering for Biped Robot

State Estimation for a Position-Controlled Biped Humanoid Robot Using Simple Models

Proprioceptive State Estimation of Legged Robots with Kinematic Chain Modeling

OptiState: State Estimation of Legged Robots using Gated Networks with Transformer-based Vision and Kalman Filtering

State Estimation for a Humanoid Robot

Legged Robot State-Estimation Through Combined Forward Kinematic and Preintegrated Contact Factors

Legged Robot State Estimation within Non-inertial Environments

Fast Decentralized State Estimation for Legged Robot Locomotion via EKF and MHE

Robust Legged Robot State Estimation Using Factor Graph Optimization

LIKO: LiDAR, Inertial, and Kinematic Odometry for Bipedal Robots

State Estimation for Human Motion and Humanoid Locomotion

Advancing Robust State Estimation of Wheeled Robots in Degenerate Environments: Harnessing Ground Manifold and Motion States

State Estimation Transformers for Agile Legged Locomotion

Body State Estimation in a Quasi-Passive Bipedal Robot During Dynamic Walking

Online Learning-Based Inertial Parameter Identification of Unknown Object for Model-Based Control of Wheeled Humanoids

Learning Inertial Odometry for Dynamic Legged Robot State Estimation

State Estimation for Legged Robots Using Contact-Centric Leg Odometry