Learning Push Recovery Behaviors for Humanoid Walking Using Deep Reinforcement Learning

Dicksiano C. Melo, Marcos R. O. A. Maximo, Adilson Marques da Cunha
DOI: https://doi.org/10.1007/s10846-022-01656-7
2022-08-21
Journal of Intelligent and Robotic Systems: Theory and Applications
Abstract: The development of a robust and versatile biped walking engine might be considered one of the hardest problems in Mobile Robotics. Even well-developed cities contain obstacles that make the navigation of these agents without human assistance infeasible. Therefore, it is essential that they be able to dynamically restore their own balance when subjected to certain types of external disturbances. To that end, this article contributes an implementation of a Push Recovery controller that improves the performance of the walking engine used by a simulated humanoid agent from the RoboCup 3D Soccer Simulation League environment. This work applies Proximal Policy Optimization to learn a movement policy in this simulator. Our learned policy was able to surpass the baselines with statistical significance. Finally, we propose two approaches based on Transfer Learning and Imitation Learning to achieve a final policy that performs well across a wide range of disturbance directions.