iWalker: Imperative Visual Planning for Walking Humanoid Robot

Xiao Lin, Yuhao Huang, Taimeng Fu, Xiaobin Xiong, Chen Wang
2024-10-01
Abstract: Humanoid robots, with the potential to perform a broad range of tasks in environments designed for humans, have been deemed crucial as a basis for general AI agents. In planning and control, traditional models and task-specific methods have been studied extensively over the past few decades, but they are inadequate for achieving the flexibility and versatility needed for general autonomy. Learning approaches, especially reinforcement learning, are powerful and popular, but they are inherently "blind" during training, relying heavily on trials in simulation without proper guidance from physical principles or underlying dynamics. In response, we propose a novel end-to-end pipeline that seamlessly integrates perception, planning, and model-based control for humanoid robot walking. We refer to our method as iWalker, which is driven by imperative learning (IL), a self-supervising neuro-symbolic learning framework. This enables the robot to learn from arbitrary unlabeled data, significantly improving its adaptability and generalization capabilities. In experiments, iWalker demonstrates effectiveness in both simulated and real-world environments, representing a significant advancement toward versatile and autonomous humanoid robots.
Robotics, Systems and Control
What problem does this paper attempt to address?
The paper addresses autonomous navigation for humanoid robots in complex environments, and in particular how to build an efficient, robust, and adaptive end-to-end system that connects visual perception, path planning, and control. It identifies the following issues with existing approaches to autonomous robot control:

1. **Limitations of traditional methods**: Predefined pipelines and manual tuning fall short in reliability, scalability, and adaptability, and require extensive reconfiguration when moved to a different environment.
2. **Challenges of learning methods**: Reinforcement learning methods, although powerful, lack guidance from physical principles or underlying dynamics during training, which leads to insufficient generalization.
3. **Drawbacks of modular systems**: Optimizing the perception, planning, and control modules separately can easily result in suboptimal overall performance and difficulty adapting to dynamic environments.

To address these issues, the paper proposes a new method called iWalker, which improves on them in the following ways:

- **End-to-end visual-to-control pipeline**: iWalker seamlessly integrates visual perception, path planning, and model-based control, enabling the robot to learn from arbitrary unlabeled data and significantly enhancing its adaptability and generalization capabilities.
- **Imperative Learning**: A Bilevel Optimization (BLO) framework incorporates physical constraints into the planning network, improving training efficiency and generalization.
- **Collision maps and Model Predictive Control (MPC)**: By combining collision maps with an MPC loss, iWalker embeds physical information during training, ensuring that planned paths and steps are physically feasible.
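To make the idea of combining a collision map with an MPC loss concrete, here is a minimal Python sketch that scores fixed candidate paths with two terms: a collision cost read from a grid map, and a tracking error from a toy lower-level controller. It is an illustration under stated assumptions, not the paper's implementation: the function names, the nearest-cell lookup, the step-limited tracker standing in for MPC, and the weights `alpha` and `beta` are all hypothetical. The method described above trains a planning network through this kind of loss, whereas the sketch only evaluates paths.

```python
import math

def collision_cost(path, cost_map, resolution=1.0):
    """Sum the map cost of the grid cell under each waypoint (nearest-cell lookup)."""
    n_rows, n_cols = len(cost_map), len(cost_map[0])
    total = 0.0
    for x, y in path:
        # Clamp indices so waypoints outside the map are charged the border cell's cost.
        col = min(max(int(math.floor(x / resolution)), 0), n_cols - 1)
        row = min(max(int(math.floor(y / resolution)), 0), n_rows - 1)
        total += cost_map[row][col]
    return total

def rollout_tracker(path, max_step):
    """Toy stand-in for MPC (assumption): head for each waypoint, moving at most max_step."""
    x, y = path[0]
    achieved = [(x, y)]
    for tx, ty in path[1:]:
        dx, dy = tx - x, ty - y
        dist = math.hypot(dx, dy)
        if dist > max_step:
            scale = max_step / dist
            dx, dy = dx * scale, dy * scale
        x, y = x + dx, y + dy
        achieved.append((x, y))
    return achieved

def tracking_loss(path, max_step):
    """Squared distance between the planned waypoints and what the tracker reached."""
    achieved = rollout_tracker(path, max_step)
    return sum((ax - px) ** 2 + (ay - py) ** 2
               for (ax, ay), (px, py) in zip(achieved, path))

def planning_loss(path, cost_map, max_step=1.5, alpha=1.0, beta=1.0):
    """Weighted sum of the collision term and the dynamic-feasibility term."""
    return alpha * collision_cost(path, cost_map) + beta * tracking_loss(path, max_step)

# 5x5 map with a single obstacle cell at row 2, column 2.
cost_map = [[0.0] * 5 for _ in range(5)]
cost_map[2][2] = 1.0

through_obstacle = [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5), (3.5, 3.5)]
clear_and_feasible = [(0.5, 0.5), (1.5, 0.5), (2.5, 0.5), (3.5, 0.5)]
clear_but_too_fast = [(0.5, 0.5), (4.5, 0.5)]

print(planning_loss(through_obstacle, cost_map))    # 1.0  (collision term only)
print(planning_loss(clear_and_feasible, cost_map))  # 0.0
print(planning_loss(clear_but_too_fast, cost_map))  # 6.25 (tracking term only)
```

The third path avoids the obstacle but asks for a step the tracker cannot take, so it is penalized by the tracking term alone. This is the sense in which a loss tied to a controller can push a planner toward physically feasible paths without labeled trajectories.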
Experimental results show that iWalker is effective in both simulated and real-world environments and produces dynamically feasible motion across a variety of indoor environments.