An Optimization Method for the Inverted Pendulum Problem Based on Deep Reinforcement Learning

Shenghan Zhu, Song Liu, Siling Feng, Mengxing Huang, Bo Sun
DOI: https://doi.org/10.1088/1742-6596/2296/1/012008
2022-01-01
Journal of Physics: Conference Series
Abstract: The inverted pendulum is a classical control problem: a pendulum starting at a random position must swing upward and reach the upright position. The problem has been solved with methods based on deep reinforcement learning (DRL), such as Deep Deterministic Policy Gradient (DDPG). However, DDPG has disadvantages. A deterministic policy is not conducive to action exploration. Moreover, the policy is only accurate if the Q value is estimated reasonably accurately, yet at the beginning of learning the Q-value estimates contain some error, so the parameters learned at this stage tend to deviate. This paper therefore combines AdaBound with the DDPG algorithm to propose an optimization method for the inverted pendulum problem, and compares its performance with that of four published baselines. The experimental results show that, for the inverted pendulum problem, the proposed method outperforms the four baselines to a certain extent.
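The abstract does not say how AdaBound is wired into DDPG, so the following is only a minimal sketch of a single AdaBound parameter update, based on the commonly published form of the optimizer (an Adam-style step whose per-element learning rate is clipped between bounds that tighten toward a final SGD-like rate). The function name `adabound_step`, the hyperparameter values, and the toy quadratic objective are all assumptions for illustration; they are not taken from the paper.

```python
import numpy as np

def adabound_step(param, grad, state, lr=1e-3, final_lr=0.1,
                  betas=(0.9, 0.999), gamma=1e-3, eps=1e-8):
    """One AdaBound-style update (illustrative sketch, not the paper's code)."""
    state["step"] += 1
    t = state["step"]
    b1, b2 = betas

    # Exponential moving averages of the gradient and squared gradient (as in Adam).
    state["m"] = b1 * state["m"] + (1 - b1) * grad
    state["v"] = b2 * state["v"] + (1 - b2) * grad ** 2

    # Adam-style per-element learning rate with bias correction folded in.
    bias1 = 1 - b1 ** t
    bias2 = 1 - b2 ** t
    adam_rate = lr * np.sqrt(bias2) / bias1 / (np.sqrt(state["v"]) + eps)

    # Dynamic bounds: very loose at first, both converging to final_lr as t grows.
    lower = final_lr * (1 - 1 / (gamma * t + 1))
    upper = final_lr * (1 + 1 / (gamma * t))
    rate = np.clip(adam_rate, lower, upper)

    return param - rate * state["m"]

# Toy usage (assumed objective): minimize f(x) = (x - 3)^2 starting from x = 0.
x = np.array([0.0])
state = {"step": 0, "m": np.zeros_like(x), "v": np.zeros_like(x)}
for _ in range(2000):
    grad = 2.0 * (x - 3.0)  # gradient of the toy quadratic
    x = adabound_step(x, grad, state)
```

In a DDPG setting, an update of this kind would presumably take the place of the usual Adam step for the actor and critic parameters; early in training it behaves like Adam, and later the clipped rate makes it behave more like SGD, which is one plausible way to limit the parameter drift the abstract attributes to inaccurate early Q-value estimates.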