A Reinforcement Learning-based Economic Model Predictive Control Framework for Autonomous Operation of Chemical Reactors

Khalid Alhazmi, Fahad Albalawi, S. Mani Sarathy
DOI: https://doi.org/10.48550/arXiv.2105.02656
2021-05-06
Abstract: Economic model predictive control (EMPC) is a promising methodology for optimal operation of dynamical processes that has been shown to improve process economics considerably. However, EMPC performance relies heavily on the accuracy of the process model used. As an alternative to model-based control strategies, reinforcement learning (RL) has been investigated as a model-free control methodology, but issues regarding its safety and stability remain an open research challenge. This work presents a novel framework for integrating EMPC and RL for online model parameter estimation of a class of nonlinear systems. In this framework, EMPC optimally operates the closed loop system while maintaining closed loop stability and recursive feasibility. At the same time, to optimize the process, the RL agent continuously compares the measured state of the process with the model's predictions (nominal states), and modifies model parameters accordingly. The major advantage of this framework is its simplicity; state-of-the-art RL algorithms and EMPC schemes can be employed with minimal modifications. The performance of the proposed framework is illustrated on a network of reactions with challenging dynamics and practical significance. This framework allows control, optimization, and model correction to be performed online and continuously, making autonomous reactor operation more attainable.
Systems and Control, Machine Learning
What problem does this paper attempt to address?
The paper combines economic model predictive control (EMPC) with reinforcement learning (RL) in a single framework aimed at the autonomous operation of chemical reactors. It focuses on the following aspects:

1. **Improving process economics**: Traditional model predictive control (MPC) performs well in optimizing dynamic processes, but its performance depends on the accuracy of the model. EMPC optimizes the process by directly considering process economics (such as yield or productivity), but it also requires accurate model parameters.
2. **Model parameter estimation**: Model parameters in chemical processes change over time because of catalyst deactivation, equipment aging, and changes in raw materials. A method is therefore needed to update these parameters in real time so that the model stays accurate.
3. **Challenges of reinforcement learning**: As a model-free control method, RL can perform control without an accurate model, but its safety and stability remain open problems. The paper instead uses RL to estimate model parameters, while EMPC maintains the stability and recursive feasibility of the system.
4. **Online model parameter estimation**: The RL agent continuously compares the measured state of the process with the state predicted by the model and adjusts the model parameters accordingly. This provides real-time model correction, which in turn improves control performance and process optimization.

### Specific objectives of the paper

- **Combining EMPC and RL**: Propose a new framework that combines EMPC and RL for online model parameter estimation of nonlinear systems.
- **Maintaining system stability**: Ensure that the system maintains closed-loop stability and recursive feasibility while RL is used for model parameter estimation.
- **Improving control performance**: Improve control performance and process optimization by updating model parameters in real time, making the autonomous operation of chemical reactors more feasible.

### Key technologies

- **Economic model predictive control (EMPC)**: Used to optimize dynamic processes while directly considering process economics.
- **Reinforcement learning (RL)**: Used for online estimation of model parameters without the need for an accurate prior model.
- **Lyapunov stability theory**: Used to prove the closed-loop stability and recursive feasibility of the system.

### Application example

The paper uses a continuous stirred-tank reactor (CSTR) for the oxidation of ethylene to ethylene oxide as an example. Simulation experiments on this system are used to demonstrate the feasibility and performance advantages of the proposed framework.

### Conclusion

The paper proposes a framework that combines EMPC and RL to address the problem of changing model parameters in traditional model-based control, with the goal of efficient autonomous operation of chemical reactors. The framework is intended to improve process economics while preserving the stability and safety of the system.
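
The parameter-correction loop summarized above, in which an agent compares the measured state with the model's nominal prediction and adjusts a model parameter, can be illustrated with a small Python sketch. This is not the paper's implementation. It assumes a toy scalar process `dx/dt = -k*x + u`, a simple feedback law standing in for the EMPC layer, and a greedy one-step rule standing in for a trained RL policy. All names (`step`, `controller`, `greedy_agent`, `run`) and all numerical values are hypothetical.

```python
# Illustrative sketch only: toy process, placeholder controller, and a greedy
# stand-in for an RL agent. None of this is taken from the paper.

DT = 0.1                       # sampling period (hypothetical value)
ACTIONS = (-0.05, 0.0, 0.05)   # candidate corrections to the model parameter


def step(x, u, k):
    """One explicit Euler step of the toy process dx/dt = -k*x + u."""
    return x + DT * (-k * x + u)


def controller(x, k_hat, setpoint=1.0):
    """Placeholder for the EMPC layer: a model-based feedforward term
    plus proportional feedback toward the setpoint."""
    return k_hat * setpoint + 0.5 * (setpoint - x)


def reward(x_measured, x_nominal):
    """Reward is the negative mismatch between measured and nominal state."""
    return -abs(x_measured - x_nominal)


def greedy_agent(x_prev, u_prev, x_measured, k_hat):
    """Stand-in for a trained RL policy: pick the parameter correction whose
    one-step model prediction best matches the measurement."""
    return max(
        ACTIONS,
        key=lambda a: reward(x_measured, step(x_prev, u_prev, k_hat + a)),
    )


def run(k_true=2.0, k_hat=1.0, x0=0.2, n_steps=200):
    """Closed loop: the controller acts on the current model, the plant
    responds with the true parameter, and the agent corrects the model."""
    x = x0
    history = []
    for _ in range(n_steps):
        u = controller(x, k_hat)
        x_next = step(x, u, k_true)                 # "measured" plant state
        k_hat += greedy_agent(x, u, x_next, k_hat)  # model correction
        history.append(k_hat)
        x = x_next
    return k_hat, history


if __name__ == "__main__":
    k_final, _ = run()
    print(f"estimated k after 200 steps: {k_final:.3f}")
```

In this sketch the controller and the parameter estimator are decoupled: the control law only reads the current estimate `k_hat`, which mirrors the summary's point that the estimation layer can be added with minimal changes to the control scheme. A learned policy would replace `greedy_agent` and would typically be trained on the same mismatch-based reward.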