Online Actor-Critic Learning for Motion Control of Non-holonomic Mobile Robot

Kai Wang, Yingmin Jia, Junping Du, Fashan Yu
DOI: https://doi.org/10.1007/978-1-4471-2467-2_209
2012-01-01
Abstract: This paper presents a control structure for non-holonomic mobile robots, built on an online policy-iteration algorithm that learns the continuous-time (CT) optimal control solution for an infinite-horizon cost. The algorithm learns online, in real time, the solution of the Hamilton–Jacobi–Bellman (HJB) equation used in optimal control design. The method finds suitable approximations of both the optimal cost and the optimal control policy in real time while also guaranteeing closed-loop stability. It is implemented as an actor/critic structure in which the actor and critic neural networks (NNs) are adapted simultaneously in continuous time. Simulation examples show the effectiveness of the new algorithm.
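The policy-iteration idea named in the abstract can be illustrated with a minimal sketch on a toy problem. This is not the paper's online neural-network method and not a robot model: it assumes a hypothetical scalar linear system dx/dt = a*x + b*u with cost integrand q*x^2 + r*u^2, for which the HJB equation reduces to a scalar Riccati equation. The function names and parameter values are illustrative assumptions. The "critic" step evaluates the cost V(x) = p*x^2 of the current policy u = -k*x, and the "actor" step improves the policy from that cost.

```python
import math


def policy_iteration_scalar_lq(a, b, q, r, k0, iterations=10):
    """Offline policy iteration for dx/dt = a*x + b*u, cost integrand q*x^2 + r*u^2.

    Toy illustration only (assumed setup, not the paper's algorithm).
    The policy is u = -k*x and the cost of that policy is V(x) = p*x^2.
    """
    k = k0
    p = None
    for _ in range(iterations):
        closed_loop = a - b * k
        if closed_loop >= 0:
            raise ValueError("policy u = -k*x must be stabilizing")
        # Policy evaluation (critic): solve 0 = q + r*k^2 + 2*p*(a - b*k) for p.
        p = (q + r * k * k) / (-2.0 * closed_loop)
        # Policy improvement (actor): u = -(1/2) * (1/r) * b * dV/dx = -(b*p/r) * x.
        k = b * p / r
    return p, k


def riccati_solution_scalar(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - (b*p)^2 / r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


# Start from the stabilizing policy k0 = 0 (valid here because a < 0).
p, k = policy_iteration_scalar_lq(a=-1.0, b=1.0, q=1.0, r=1.0, k0=0.0)
p_star = riccati_solution_scalar(a=-1.0, b=1.0, q=1.0, r=1.0)
print(p, p_star)
```

In this toy case the iterates approach the Riccati solution, which is the optimal cost coefficient. The paper's setting differs in that both steps are approximated by neural networks and adapted simultaneously online, rather than solved exactly one after the other.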