Abstract: Improving the efficiency of deep reinforcement learning (DRL) for complex systems is a challenging task. This work proposes a model-based DRL method, the RBF-ARX model guided deep reinforcement learning algorithm (RBF-ARX GDRL), to facilitate training an optimal controller for control tasks on continuous systems. Here RBF-ARX denotes an autoregressive model with exogenous inputs whose coefficients take the form of Gaussian radial basis function networks. A pseudo linear quadratic regulator (PLQR) designed from the RBF-ARX model is introduced into the DRL training process, where it supports policy training by providing a gradient component that guides the reinforcement learning in a semi-supervised manner. The actions generated by the policy network and by the PLQR are evaluated by state-action value networks, and an adaptive method based on those values searches for the direction and step size of policy updates during training. Based on the relationship between episode return and policy parameters, an anchor-and-trial scheme is proposed to improve the policy monotonically. The training process and the simulation results on stabilizing a single-stage inverted pendulum show that the proposed method facilitates training of the optimal controller for the system, and that the trained controller achieves better steady-state and transient performance in step-response experiments, with lower energy consumption.
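The guidance idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several assumptions: a fixed linear model (A, B) stands in for the state-dependent RBF-ARX model, a standard discrete-time LQR stands in for the PLQR, the policy is linear, and the names dlqr_gain, guidance_weight and guided_gradient are hypothetical. The rule "follow the guide only when the critic values it more" is one simple reading of the adaptive method, not the one defined in the paper.

```python
import numpy as np


def dlqr_gain(A, B, Q, R, iters=500):
    """Iterate the discrete-time Riccati recursion to get an LQR gain K (u = -K x)."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K


def guidance_weight(q_policy, q_guide):
    """Hypothetical rule: use the guide only when the critic rates its action higher."""
    return 1.0 if q_guide > q_policy else 0.0


def guided_gradient(theta, state, k_guide, q_policy, q_guide):
    """Gradient w.r.t. theta of 0.5 * w * ||theta @ s - a_guide||^2 for a linear policy."""
    a_policy = theta @ state        # action of the (assumed linear) policy
    a_guide = -k_guide @ state      # action of the LQR stand-in for the PLQR
    w = guidance_weight(q_policy, q_guide)
    return w * np.outer(a_policy - a_guide, state)


if __name__ == "__main__":
    # Double integrator as a toy plant (an assumption, not the pendulum model).
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.0], [0.1]])
    K = dlqr_gain(A, B, np.eye(2), np.eye(1))
    theta = np.zeros((1, 2))
    s = np.array([1.0, 0.0])
    # The critic values below are made-up numbers for illustration only.
    grad = guided_gradient(theta, s, K, q_policy=-5.0, q_guide=-2.0)
    theta = theta - 0.1 * grad      # one step toward the guide's action
    print(K, theta)
```

In a full actor-critic setup this gradient term would be added to the usual policy gradient, and the two critic values would come from the learned state-action value networks instead of being passed in by hand.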

An Optimization Method for the Inverted Pendulum Problem Based on Deep Reinforcement Learning

Solve the Inverted Pendulum Problem Base on DQN Algorithm

Spacecraft Attitude Maneuver Planning Based on Deep Reinforcement Learning under Complex Constraints

A Deep Reinforcement Learning Approach towards Pendulum Swing-up Problem based on TF-Agents

Safe Reinforcement Learning Using Finite-Horizon Gradient-based Estimation

Control of the Double Inverted Pendulum Based on Reinforcement Learning

Guided Deep Reinforcement Learning based on RBF-ARX Pseudo LQR in Single Stage Inverted Pendulum

Balance Controller Design for Inverted Pendulum Considering Detail Reward Function and Two-Phase Learning Protocol

Reentry Trajectory Optimization Based on Deep Reinforcement Learning

Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method

Dueling Network Architecture for Multi-Agent Deep Deterministic Policy Gradient

Improve PID Controller Through Reinforcement Learning

A Hybrid Approach for Reinforcement Learning Using Virtual Policy Gradient for Balancing an Inverted Pendulum

Reward-Adaptive Reinforcement Learning: Dynamic Policy Gradient Optimization for Bipedal Locomotion

A Deep Reinforcement Learning Control Method Guided by RBF-ARX Pseudo LQR

A Deep Reinforcement Learning Method for Lion and Man Problem

Application of Deep Reinforcement Learning Control of an Inverted Hydraulic Pendulum

A Deep Reinforcement Learning Approach to Efficient Distributed Optimization

A Deep Reinforcement Learning Based Efficient Optimization Solution Method for Inverse Kinematics of Hyper-redundant Robot

Adaptive Primal-Dual Method for Safe Reinforcement Learning

Optimization Algorithm for Interplanetary Transfer Trajectories of Solar Sailcraft Based on Deep Reinforcement Learning