Abstract: Economic model predictive control (EMPC) is a promising methodology for optimal operation of dynamical processes that has been shown to improve process economics considerably. However, EMPC performance relies heavily on the accuracy of the process model used. As an alternative to model-based control strategies, reinforcement learning (RL) has been investigated as a model-free control methodology, but issues regarding its safety and stability remain an open research challenge. This work presents a novel framework for integrating EMPC and RL for online model parameter estimation of a class of nonlinear systems. In this framework, EMPC optimally operates the closed-loop system while maintaining closed-loop stability and recursive feasibility. At the same time, to optimize the process, the RL agent continuously compares the measured state of the process with the model's predictions (nominal states), and modifies model parameters accordingly. The major advantage of this framework is its simplicity; state-of-the-art RL algorithms and EMPC schemes can be employed with minimal modifications. The performance of the proposed framework is illustrated on a network of reactions with challenging dynamics and practical significance. This framework allows control, optimization, and model correction to be performed online and continuously, making autonomous reactor operation more attainable.

What problem does this paper attempt to address?

The main problem this paper addresses is how to combine economic model predictive control (EMPC) and reinforcement learning (RL) in a new framework for the autonomous operation of chemical reactors. Specifically, the paper focuses on the following aspects:

1. **Improving process economics**: Traditional model predictive control (MPC) performs well in optimizing dynamic processes, but its performance depends on the accuracy of the model. EMPC optimizes the process by directly considering process economics (such as yield or productivity), but it also requires accurate model parameters.
2. **Model parameter estimation**: Model parameters in chemical processes change due to factors such as catalyst deactivation, equipment aging, and raw material changes. A method is therefore needed to update these parameters in real time so that the model stays accurate.
3. **Challenges of reinforcement learning**: Although RL, as a model-free control method, can perform control without an accurate model, its safety and stability remain open challenges. The paper proposes combining RL with EMPC: RL estimates the model parameters, while EMPC maintains the stability and recursive feasibility of the system.
4. **Online model parameter estimation**: The paper proposes a framework for online model parameter estimation in which the RL agent continuously adjusts the model parameters by comparing the measured state of the process with the state predicted by the model. This enables real-time model correction, which in turn improves control performance and process optimization.

### Specific objectives of the paper

- **Combining EMPC and RL**: Propose a new framework that combines EMPC and RL for online model parameter estimation of nonlinear systems.
- **Maintaining system stability**: Ensure that the system maintains closed-loop stability and recursive feasibility while RL is used for model parameter estimation.
- **Improving control performance**: Improve control performance and process optimization by updating model parameters in real time, making the autonomous operation of chemical reactors more feasible.

### Key technologies

- **Economic model predictive control (EMPC)**: Used to optimize dynamic processes while directly considering process economics.
- **Reinforcement learning (RL)**: Used for online estimation of model parameters without the need for an accurate prior model.
- **Lyapunov stability theory**: Used to prove the closed-loop stability and recursive feasibility of the system.

### Application example

The paper uses a continuous stirred-tank reactor (CSTR) for the oxidation of ethylene to ethylene oxide as an example to demonstrate the proposed framework. Simulation experiments are used to show the framework's feasibility and performance.

### Conclusion

The paper proposes a new framework that combines EMPC and RL. It addresses the difficulty that changing model parameters pose for traditional model-based control and supports the autonomous operation of chemical reactors. According to the paper, the framework improves process economics while maintaining the stability and safety of the system.
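The parameter-correction loop described in the summary above (compare the measured state with the model's prediction, then adjust the model parameter) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the scalar process `dx/dt = -theta*x + u`, the sampling period `DT`, the names `step`, `placeholder_controller`, `choose_adjustment`, and `run`, and all numeric values. The EMPC is replaced by a fixed periodic input, and the RL agent is replaced by a greedy one-step search over three candidate parameter adjustments, so this shows only the structure of the loop, not the paper's algorithm or its stability guarantees.

```python
import math

DT = 0.1  # hypothetical sampling period


def step(x, u, theta):
    """One explicit-Euler step of the toy process dx/dt = -theta*x + u."""
    return x + DT * (-theta * x + u)


def placeholder_controller(k):
    """Stand-in for the EMPC: a fixed periodic input, not an optimizer."""
    return 1.0 + 0.5 * math.sin(0.3 * k)


def choose_adjustment(x_prev, u_prev, x_meas, theta_hat, delta):
    """Greedy stand-in for the RL agent.

    Tries three candidate changes to the parameter estimate and returns
    the one whose one-step model prediction is closest to the measurement.
    """
    candidates = (-delta, 0.0, delta)

    def reward(d):
        # Negative squared mismatch between measured and predicted state.
        predicted = step(x_prev, u_prev, theta_hat + d)
        return -(x_meas - predicted) ** 2

    return max(candidates, key=reward)


def run(theta_true=2.0, theta_hat=1.0, delta=0.05, n_steps=200):
    """Simulate the loop and return the parameter estimate at every step."""
    x = 1.0
    history = []
    for k in range(n_steps):
        u = placeholder_controller(k)
        x_next = step(x, u, theta_true)  # "plant" measurement (true parameter)
        theta_hat += choose_adjustment(x, u, x_next, theta_hat, delta)
        history.append(theta_hat)
        x = x_next
    return history


history = run()
print(f"final estimate: {history[-1]:.2f}")
```

In this sketch the estimate moves toward `theta_true` in increments of `delta` and then holds once the one-step prediction matches the measurement. A real implementation would replace `placeholder_controller` with an EMPC optimization and `choose_adjustment` with a trained RL policy.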

A Reinforcement Learning-based Economic Model Predictive Control Framework for Autonomous Operation of Chemical Reactors

Dynamic Modeling And Nonlinear Predictive Control Based On Partitioned Model And Nonlinear Optimization

Adaptive nonlinear model predictive control for a class of multivariable chemical processes

Accelerating Reinforcement Learning with Local Data Enhancement for Process Control

Direct learning of improved control policies from historical plant data

Multistep Criticality Search and Power Shaping in Microreactors with Reinforcement Learning

Predictive Control of Voltage Source Inverter: an Online Reinforcement Learning Solution

Machine learning-based ethylene concentration estimation, real-time optimization and feedback control of an experimental electrochemical reactor

Model-Based Reinforcement Learning Control of Reaction-Diffusion Problems

Control-Informed Reinforcement Learning for Chemical Processes

Accelerating reinforcement learning with case-based model-assisted experience augmentation for process control

A Unified Framework for Online Data-Driven Predictive Control with Robust Safety Guarantees

Encrypted distributed model predictive control of nonlinear processes

End-to-End Reinforcement Learning of Koopman Models for Economic Nonlinear Model Predictive Control

Safe Transfer-Reinforcement-Learning-Based Optimal Control of Nonlinear Systems

Reinforcement Learning Based on Real-Time Iteration NMPC

Online Operational Decision-making for Integrated Electric-Gas Systems with Safe Reinforcement Learning

Robust model predictive control for large-scale distributed parameter systems under uncertainty

PC-Gym: Benchmark Environments For Process Control Problems

Hierarchical Framework for Interpretable and Probabilistic Model-Based Safe Reinforcement Learning

Model Predictive Control and Reinforcement Learning: A Unified Framework Based on Dynamic Programming