Abstract: The goethite iron-removal process is an important step for removing iron ions in zinc hydrometallurgy. However, because the process is a coherent system with complex reaction mechanisms, associated uncertainties, and interconnected adjacent reactors, accurate control of the ion concentration is difficult. Because a large amount of historical data can be collected during the process, this paper proposes an optimal control algorithm based on off-policy reinforcement learning to overcome these difficulties. The neural network weights are learned offline from the historical data, and the optimal control strategy is solved online. First, a bounded function is introduced to define the maximum effect of the coherent system on the subsystem cost function and to extend the cost function of the nominal system, so that the decentralized guaranteed cost control problem can be expressed as the optimal control problem of the nominal system. Then, an approximate iterative control algorithm based on an actor-critic structure is proposed. The actor and critic neural networks approximate the control strategies and the cost functions, respectively. To make the learning fully offline, a new neural network is added to the actor-critic structure to approximate a part of the unknown system structure, and the parameters of the three neural networks are optimized by the state transition algorithm. Finally, the strategy update and strategy iteration operations are performed alternately to learn the optimal control strategies. The effectiveness and flexibility of the proposed off-policy optimal control method are validated with data from a real industrial goethite iron-removal process.
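
The alternation between evaluating a control strategy on logged data and then improving it can be illustrated on a toy problem. The sketch below is not the method of the paper: it replaces the three neural networks and the state transition algorithm with a quadratic, linear-in-weights critic fitted by least squares, and it uses a hypothetical scalar linear plant with a quadratic stage cost. All names and constants (`A_SYS`, `B_SYS`, `Q_COST`, `R_COST`, `make_batch`, `policy_iteration`) are assumptions introduced for illustration only. The plant parameters are used solely to generate the batch; the learner sees only the logged transitions.

```python
import random

# Hypothetical toy plant and cost (assumptions, not taken from the paper):
# x_next = A_SYS * x + B_SYS * u, stage cost = Q_COST * x^2 + R_COST * u^2
A_SYS, B_SYS = 0.9, 0.5
Q_COST, R_COST = 1.0, 0.1


def make_batch(n=200, seed=0):
    """Simulate 'historical data': transitions logged under random inputs."""
    rng = random.Random(seed)
    batch = []
    for _ in range(n):
        x, u = rng.uniform(-1, 1), rng.uniform(-1, 1)
        cost = Q_COST * x * x + R_COST * u * u
        batch.append((x, u, cost, A_SYS * x + B_SYS * u))
    return batch


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0] * 3
    for r in reversed(range(3)):
        s = M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / M[r][r]
    return x


def features(x, u):
    # Quadratic basis: Q(x, u) = w0*x^2 + w1*x*u + w2*u^2
    return [x * x, x * u, u * u]


def evaluate_policy(batch, k):
    """Critic step: least-squares fit of Q for the policy u = -k*x.

    Uses the Bellman relation Q(x, u) = cost + Q(x_next, -k*x_next),
    which holds for any logged input u (off-policy).
    """
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for x, u, cost, x_next in batch:
        phi = features(x, u)
        phi_next = features(x_next, -k * x_next)
        d = [p - pn for p, pn in zip(phi, phi_next)]
        for i in range(3):
            b[i] += d[i] * cost
            for j in range(3):
                A[i][j] += d[i] * d[j]
    return solve3(A, b)


def policy_iteration(batch, k0=0.0, iters=10):
    """Alternate critic fitting and greedy gain update on a fixed batch."""
    k = k0  # k0 = 0 is stabilizing here because |A_SYS| < 1
    for _ in range(iters):
        w = evaluate_policy(batch, k)
        k = w[1] / (2.0 * w[2])  # minimizer of Q over u gives u = -k*x
    return k


if __name__ == "__main__":
    print("learned feedback gain:", policy_iteration(make_batch()))
```

Because the toy data are noise-free, the learned gain can be checked against the gain obtained by iterating the scalar Riccati recursion for the same plant. With real process data, noise and model mismatch would make the fit approximate.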

Offline Constrained Reinforcement Learning for Batch-to-batch Optimization of Cobalt Oxalate Synthesis Process

Behavior Proximal Policy Optimization

Successive Convex Approximation Based Off-Policy Optimization for Constrained Reinforcement Learning

Hierarchical batch-to-batch optimization of cobalt oxalate synthesis process based on data-driven model

Batch-to-batch optimization of cobalt oxalate synthesis process using modifier-adaptation strategy with latent variable model

Active Learning Strategy for Online Prediction of Particle Size Distribution in Cobalt Oxalate Synthesis Process

Online Quality Prediction for Cobalt Oxalate Synthesis Process Using Least Squares Support Vector Regression Approach with Dual Updating

Batch-to-batch control of particle size distribution in cobalt oxalate synthesis process based on hybrid model

Hierarchical-linked Batch-to-batch Optimization Based on Transfer Learning of Synthesis Process

Partial Least Squares Model-Based Batch-To-Batch Control Of Particle Size Distribution In Cobalt Oxalate Synthesis Process

Real-time Optimization for Chemical Processes Based on On-Line Modeling of Controlled Variables

Safe reinforcement learning for industrial optimal control: A case study from metallurgical industry

Optimal Control of Iron-Removal Systems Based on Off-Policy Reinforcement Learning

Optimal control of batch processes via a deterministic Q-learning method

POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning

Online Tuning for Offline Decentralized Multi-Agent Reinforcement Learning

Efficient Cobalt Oxalate Synthesis Process Optimization Via Second‐order Modifier Adaptation with Transfer Learning

Multi-objective reinforcement learning for fed-batch fermentation process control

Batch reinforcement learning based dynamic optimization for polyethylene grade transitions

Learning to Optimize: Reference Vector Reinforcement Learning Adaption to Constrained Many-Objective Optimization of Industrial Copper Burdening System

Offline Reinforcement Learning for Optimizing Production Bidding Policies