Optimal control of batch processes via a deterministic Q-learning method

Abdelrahman ElMezain, Mohamed Saleh, Jie Zhang, Ahmed Soliman, Seif Fateen
DOI: https://doi.org/10.48550/arXiv.1904.03654
2019-04-14
Abstract: Dynamic optimization of nonlinear chemical systems, such as batch reactors, should be applied online, and the control action should be chosen according to the current state of the system rather than the current time instant. Recent state-of-the-art methods apply the control based on the current time instant only. This is unsuitable in most cases because it is not robust to changes in the system. This paper proposes a deterministic Q-learning method for robust online optimization of batch reactors. The method is applied to simple batch reactor models, and its results are compared with those of other dynamic optimization methods to show its effectiveness. The main advantage of the proposed method is that it can accommodate unplanned changes during the process, such as sudden changes during the reaction, by changing the control action. In general, the objective is to maximize the final product obtained or to meet certain product specifications (e.g. minimizing side products).
Systems and Control
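
To make the state-based idea in the abstract concrete, the following is a minimal sketch of tabular Q-learning with a deterministic update (learning rate 1, no discounting) on a toy batch reactor. Everything specific here is an assumption for illustration and is not taken from the paper: the series reaction A -> B -> C, the Arrhenius constants in `rates`, the temperature set `TEMPS`, the horizon, and the concentration binning used to index the table (an approximation, since nearby states share one entry). The reward is the final concentration of B.

```python
import math
import random

# Hypothetical toy model (not from the paper): A -> B -> C, first-order kinetics.
TEMPS = [300.0, 320.0, 340.0]  # candidate temperatures in K (assumed)
N_STEPS = 10                   # control intervals per batch
DT = 0.1                       # interval length, arbitrary time units
GRID = 20                      # bins per unit concentration for the Q-table key


def rates(temp):
    """Illustrative Arrhenius rate constants k1 (A->B) and k2 (B->C)."""
    k1 = 4.0e3 * math.exp(-2500.0 / temp)
    k2 = 6.2e5 * math.exp(-5000.0 / temp)
    return k1, k2


def step(c_a, c_b, temp, substeps=10):
    """Advance the reactor one control interval with explicit Euler substeps."""
    k1, k2 = rates(temp)
    h = DT / substeps
    for _ in range(substeps):
        r1, r2 = k1 * c_a, k2 * c_b
        c_a -= r1 * h
        c_b += (r1 - r2) * h
    return c_a, c_b


def state_key(k, c_a, c_b):
    """Discretize (interval index, cA, cB) into a hashable table key."""
    return (k, round(c_a * GRID), round(c_b * GRID))


def train(episodes=3000, epsilon=0.2, seed=0):
    """Epsilon-greedy Q-learning; entries start at 1.0 (optimistic upper bound on cB)."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        c_a, c_b = 1.0, 0.0
        for k in range(N_STEPS):
            values = q.setdefault(state_key(k, c_a, c_b), [1.0] * len(TEMPS))
            if rng.random() < epsilon:
                a = rng.randrange(len(TEMPS))
            else:
                a = max(range(len(TEMPS)), key=values.__getitem__)
            c_a, c_b = step(c_a, c_b, TEMPS[a])
            if k == N_STEPS - 1:
                target = c_b  # terminal reward: final amount of B
            else:
                nxt = q.setdefault(state_key(k + 1, c_a, c_b), [1.0] * len(TEMPS))
                target = max(nxt)  # zero intermediate reward, gamma = 1
            values[a] = target  # deterministic update: overwrite, no averaging
    return q


def greedy_rollout(q, c_a=1.0, c_b=0.0, start=0):
    """Follow the learned policy from a given state; unseen states fall back to TEMPS[0]."""
    temps = []
    for k in range(start, N_STEPS):
        values = q.get(state_key(k, c_a, c_b), [0.0] * len(TEMPS))
        a = max(range(len(TEMPS)), key=values.__getitem__)
        temps.append(TEMPS[a])
        c_a, c_b = step(c_a, c_b, TEMPS[a])
    return temps, c_b
```

Calling `q = train()` and then `greedy_rollout(q)` returns the temperature sequence and the final concentration of B for a nominal batch. Because the table is indexed by state, a call such as `greedy_rollout(q, c_a=0.5, c_b=0.3, start=5)` asks the same policy how to proceed after a hypothetical mid-batch disturbance, which a purely time-indexed profile cannot do. States never visited in training fall back to the first temperature, so a practical version would need broader exploration or function approximation.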