Abstract: Deep Reinforcement Learning (DRL) techniques have received significant attention in control and decision-making algorithms. Most applications involve complex decision-making systems, where the algorithms' computational power and cost are justified. While model-based versions are emerging, model-free DRL approaches are attractive for their independence from models, yet their performance remains relatively unexplored, particularly in applied control. This study conducts a thorough performance analysis comparing the data-driven DRL paradigm with a classical state feedback controller, both designed based on the same cost (reward) function of the linear quadratic regulator (LQR) problem. Twelve additional performance criteria are introduced to assess the controllers' performance independently of the LQR problem for which they are designed. Two Deep Deterministic Policy Gradient (DDPG)-based controllers are developed, DDPG being chosen for its widespread reputation. These controllers address a challenging setpoint tracking problem in a Non-Minimum Phase (NMP) system. The performance and robustness of the controllers are assessed in the presence of operational challenges, including disturbance, noise, varying initial conditions, and model uncertainties. The findings suggest that the DDPG controller demonstrates promising behavior under rigorous test conditions. Nevertheless, further improvements are necessary for the DDPG controller to outperform classical methods in all criteria. While DRL algorithms may excel in complex environments owing to the flexibility in the reward function definition, this paper offers practical insights and a comparison framework specifically designed to evaluate these algorithms within the context of control engineering.
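To illustrate what it means for a state feedback controller and a DRL agent to share the same LQR cost (reward) function, the sketch below may help. It is not the paper's implementation: it assumes a hypothetical scalar discrete-time plant x[k+1] = a*x[k] + b*u[k] with illustrative values for a, b, q, and r, and all function names are made up for this example. The classical gain is obtained by iterating the scalar Riccati equation, and the reward a DRL agent would receive is taken as the negative of the quadratic stage cost.

```python
# Minimal sketch under stated assumptions: a hypothetical scalar plant,
# not the non-minimum phase system studied in the paper.

def lqr_gain_scalar(a, b, q, r, iters=500):
    """Iterate the scalar discrete-time Riccati equation; return (k, p)."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)  # state feedback law: u = -k * x
    return k, p

def reward(x, u, q, r):
    """Reward for a DRL agent: the negative LQR stage cost."""
    return -(q * x * x + r * u * u)

def rollout_cost(a, b, q, r, k, x0, steps=200):
    """Accumulate the LQR cost of the feedback law u = -k * x from x0."""
    x, total = x0, 0.0
    for _ in range(steps):
        u = -k * x
        total -= reward(x, u, q, r)  # cost is the negated reward
        x = a * x + b * u
    return total

# Illustrative (assumed) parameters for an unstable open-loop plant.
A, B, Q, R = 1.2, 1.0, 1.0, 0.1
K, P = lqr_gain_scalar(A, B, Q, R)
J = rollout_cost(A, B, Q, R, K, x0=1.0)
print("gain:", K, "closed-loop pole:", A - B * K, "cost:", J)
```

Under these assumptions, the accumulated cost of the Riccati-based controller from x0 matches P * x0**2, which gives a reference value against which the return of a learned policy trained on the same reward could be compared.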
