A Huber reward function-driven deep reinforcement learning solution for cart-pole balancing problem

Shaili Mishra,Anuja Arora
DOI: https://doi.org/10.1007/s00521-022-07606-6
2022-07-30
Neural Computing and Applications
Abstract:Lots of learning tasks require experience learning based on activities performed in real scenarios which are affected by environmental factors. Therefore, real-time systems demand a model to learn from working experience—such as physical object properties-driven system models, trajectory prediction, and Atari games. This experience-driven learning model uses reinforcement learning which is considered as an important research topic and needs problem-specific reasoning model simulation. In this research paper, cart-pole balancing problem is selected as a problem where the system learns using Q-learning and Deep Q network reinforcement learning approaches. Pragmatic foundation of cart-pole problem and its solution with the help of Q learning and DQN reinforcement learning model are validated, and a comparison of achieved outcome in the form of accuracy and fast convergence is presented. An unexperienced Huber loss function is applied on cart-pole balancing problem, and results are in favor of Huber loss function in comparison with mean-squared error loss function. Hence, experimental study suggests the use of DQN with Huber loss reward function for fast learning and convergence of cart pole in balanced condition.
computer science, artificial intelligence
What problem does this paper attempt to address?