Abstract: Reinforcement learning (RL) is an artificial intelligence technique that can learn an adaptive optimal control law online. Earlier control approaches usually depend heavily on the model parameters of the system, and most existing RL methods rely on state feedback, so their application in actual industrial production is limited. Moreover, developing accurate process models and ensuring closed-loop control performance have become more challenging as modern businesses place a premium on product quality and economic efficiency. This work therefore introduces a novel data-driven two-dimensional (2D) off-policy Q-learning method based on output feedback to achieve optimal tracking control of batch processes. First, the system is extended with the error between the actual output and the given set-point to ensure good tracking performance. Second, by analyzing the relationship between the value function and the Q-function obtained from the performance index of the 2D system, a 2D Bellman equation is derived in terms of output feedback that is independent of the model parameters. The proposed method solves the optimal control problem by executing policy iteration using only the measurement data of the system along the batch and time directions. The unbiasedness and convergence of the proposed approach are then rigorously proven. Finally, simulation results for an injection molding process demonstrate that the proposed method can determine the optimal control law as the number of batches increases.

Optimal Tracking Control of Nonlinear Batch Processes with Unknown Dynamics Using Two-Dimensional Off-Policy Interleaved Q-learning Algorithm

Novel data-driven two-dimensional Q-learning for optimal tracking control of batch process with unknown dynamics

Optimal tracking control of batch processes with time-invariant state delay: Adaptive Q-learning with two-dimensional state and control policy

Novel two-dimensional off-policy Q-learning method for output feedback optimal tracking control of batch process with unknown dynamics

A Learning-Based Optimal Tracking Controller for Continuous Linear Systems with Unknown Dynamics: Theory and Case Study

Off-policy two-dimensional reinforcement learning for optimal tracking control of batch processes with network-induced dropout and disturbances

Data-Efficient Off-Policy Learning for Distributed Optimal Tracking Control of HMAS with Unidentified Exosystem Dynamics

Model-Free Optimal Tracking Design With Evolving Control Strategies via Q-Learning

Quadratic Tracking Control of Linear Stochastic Systems with Unknown Dynamics Using Average Off-Policy Q-Learning Method

Control of Nonaffine Nonlinear Discrete-Time Systems Using Reinforcement-Learning-Based Linearly Parameterized Neural Networks

Adaptive Learning-Based Path-Tracking Control for Unknown Vehicle Systems under Performance Optimization

Two-dimensional model-free Q-learning-based output feedback fault-tolerant control for batch processes

Online reinforcement learning control of unknown nonaffine nonlinear discrete time systems

Human-in-the-loop Distributed Cooperative Tracking Control with Applications to Autonomous Ground Vehicles: A Data-Driven Mixed Iteration Approach

A Combined Policy Gradient and Q-learning Method for Data-driven Optimal Control Problems

Optimal Tracking Control of Nonlinear Multiagent Systems Using Internal Reinforce Q-Learning

Reinforcement Learning-Based Control for Nonlinear Discrete-Time Systems with Unknown Control Directions and Control Constraints

Dynamical Hyperparameter Optimization Via Deep Reinforcement Learning in Tracking

Asynchronous iterative Q-learning based tracking control for nonlinear discrete-time multi-agent systems

Robust Optimal Parallel Tracking Control Based on Adaptive Dynamic Programming

The Adaptive Optimal Output Feedback Tracking Control of Unknown Discrete-Time Linear Systems Using a Multistep Q-Learning Approach