Hongbo Zhang, Zhongyu Li, Xuanqi Zeng, Laura Smith, Kyle Stachowicz, Dhruv Shah, Linzhu Yue, Zhitao Song, Weipeng Xia, Sergey Levine, Koushil Sreenath, Yun-hui Liu
Abstract: The enhanced mobility brought by legged locomotion empowers quadrupedal robots to navigate through complex and unstructured environments. However, optimizing agile locomotion while accounting for the varying energy costs of traversing different terrains remains an open challenge. Most previous work focuses on planning trajectories with traversability cost estimation based on human-labeled environmental features. However, this human-centric approach is insufficient because it does not account for the varying capabilities of the robot locomotion controllers over challenging terrains. To address this, we develop a novel traversability estimator in a robot-centric manner, based on the value function of the robot's locomotion controller. This estimator is integrated into a new learning-based RGBD navigation framework. The framework develops a planner that guides the robot in avoiding obstacles and hard-to-traverse terrains while reaching its goals. The training of the navigation planner is directly performed in the real world using a sample-efficient reinforcement learning method. Through extensive benchmarking, we demonstrate that the proposed framework achieves the best performance in accurate traversability cost estimation and efficient learning from multi-modal data (the robot's color and depth vision, and proprioceptive feedback) for real-world training. Using the proposed method, a quadrupedal robot learns to perform traversability-aware navigation through trial and error in various real-world environments with challenging terrains that are difficult to classify using depth vision alone.
What problem does this paper attempt to address?
The paper asks how a quadruped robot navigating complex, unstructured environments can not only recognize and avoid obstacles but also choose its path according to the traversability cost of the terrain. Specifically, it aims to develop a vision-based learning framework that lets the robot learn efficiently, in the real world, how to evaluate the traversability cost of different terrains and plan paths accordingly, yielding navigation that is more stable, more energy-efficient, and gentler on the hardware.
### The core challenges of the problem include:
1. **Estimation of terrain traversability cost**: Traditional traversability estimation methods usually rely on human-labeled environmental features, and these methods do not fully consider the actual motion control performance of the robot on different terrains.
2. **Fusion of multi-modal data**: The robot needs to fuse color images (used to recognize terrain textures), depth images (used to recognize non-traversable areas such as obstacles), and proprioceptive feedback (implicitly encoding physical properties) in real time to accurately evaluate the traversability of the terrain.
3. **Sample efficiency of reinforcement learning**: To train the robot efficiently in the real world, a sample-efficient reinforcement learning method is needed so that the robot can learn quickly through trial and error within a limited time.
### Overview of the solution:
- **Value-function-based traversability estimator**: Estimate the traversability cost of the terrain by analyzing the value function of the low-level motion controller, thereby reflecting the motion control performance of the robot on different terrains.
- **Hierarchical reinforcement learning framework**: Combine simulation and real-world data, and adopt a multi-level reinforcement learning framework to gradually train the robot's ability from low-level motion control to high-level navigation planning.
- **Online and offline data fusion**: Utilize offline data (such as expert demonstrations) and online data (the robot's interaction data in the real world), and train through sample-efficient reinforcement learning algorithms (such as RLPD), enabling the robot to quickly adapt to new environments.
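The online and offline data fusion in the last item can be illustrated with a small sketch. RLPD is generally described as using symmetric sampling, where each training batch is drawn half from the offline dataset and half from the online replay buffer. The function name `sample_symmetric_batch`, the tuple-based transitions, and the buffer sizes below are hypothetical choices for illustration, not the paper's implementation.

```python
import random


def sample_symmetric_batch(offline_data, online_data, batch_size, rng=None):
    """Draw half of a batch from offline data and half from online data.

    Hypothetical helper: transitions are treated as opaque Python objects.
    """
    rng = rng or random.Random()
    half = batch_size // 2
    # Sample with replacement so a small online buffer still fills its half.
    offline_part = rng.choices(offline_data, k=half)
    online_part = rng.choices(online_data, k=batch_size - half)
    batch = offline_part + online_part
    rng.shuffle(batch)  # avoid a fixed offline/online ordering in the batch
    return batch


# Toy buffers: many offline transitions, few online ones (assumed sizes).
offline = [("offline", i) for i in range(100)]
online = [("online", i) for i in range(10)]
batch = sample_symmetric_batch(offline, online, batch_size=8, rng=random.Random(0))
```

Sampling the two sources in fixed proportion keeps the scarce real-world transitions from being drowned out by the larger offline set early in training.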
Through these methods, the paper proposes a new traversability-aware navigation framework for quadruped robots that takes RGBD input and can efficiently learn and optimize path planning in the real world.
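As a rough illustration of the value-function-based traversability idea, the sketch below turns value estimates from a locomotion controller's critic into a cost between 0 and 1, so that terrain where the controller expects low return receives a high cost. The min-max mapping, the function name `traversability_cost`, and the example values are assumptions made here for illustration; this summary does not state the exact mapping the paper uses.

```python
def traversability_cost(values, v_min, v_max):
    """Map critic value estimates to costs in [0, 1] (low value -> high cost).

    Hypothetical mapping: inverted min-max normalization with clipping.
    """
    span = v_max - v_min
    if span <= 0:
        raise ValueError("v_max must be greater than v_min")
    costs = []
    for v in values:
        normalized = (v - v_min) / span
        normalized = min(1.0, max(0.0, normalized))  # clip to [0, 1]
        costs.append(1.0 - normalized)
    return costs


# Made-up value estimates for three terrain patches (illustrative only).
terrain_values = {"pavement": 9.0, "grass": 6.0, "loose gravel": 2.0}
costs = traversability_cost(list(terrain_values.values()), v_min=0.0, v_max=10.0)
terrain_costs = dict(zip(terrain_values, costs))
```

In this toy setup the patch with the lowest value estimate ends up with the highest cost, which is the ordering a planner would need in order to steer away from hard-to-traverse terrain.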