A Heuristics-Based Reinforcement Learning Method to Control Bipedal Robots

Daoling Qin, Guoteng Zhang, Zhengguo Zhu, Teng Chen, Weiliang Zhu, Xuewen Rong, Anhuan Xie, Yibin Li
DOI: https://doi.org/10.1142/s0219843623500135
2024-01-01
International Journal of Humanoid Robotics
Abstract: A new method, the heuristics-based reinforcement learning (HBRL) framework, is proposed to control bipedal robots so that they achieve flexible omni-directional motion and robust locomotion under complex disturbances. HBRL trains efficiently in simulation. Heuristic reference trajectories play a crucial role in HBRL by guiding the training process. Three further components optimize training: exploration rewards, a leg-foot reset condition, and a command curriculum. An estimator network supplies linear velocities and foot contact information. We train controllers on flat ground in simulation. To demonstrate robustness and versatility, the trained controllers were tested on BRAVER, a point-foot bipedal robot with three joints on each leg. The controllers enabled BRAVER to perform omni-directional locomotion with a maximum forward speed of 2 m/s. The robot could also maintain balance under external pushes and on uneven terrain.