A Safe Reinforcement Learning Algorithm for Supervisory Control of Power Plants

Yixuan Sun, Sami Khairy, Richard B. Vilim, Rui Hu, Akshay J. Dave

2024-01-24

Abstract: Traditional control theory-based methods require tailored engineering for each system and constant fine-tuning. In power plant control, one often needs to obtain a precise representation of the system dynamics and carefully design the control scheme accordingly. Model-free reinforcement learning (RL) has emerged as a promising solution for control tasks due to its ability to learn from trial-and-error interactions with the environment. It eliminates the need to explicitly model the environment's dynamics, a model that may itself be inaccurate. However, the direct imposition of state constraints in power plant control raises challenges for standard RL methods. To address this, we propose a chance-constrained RL algorithm based on Proximal Policy Optimization for supervisory control. Our method employs Lagrangian relaxation to convert the constrained optimization problem into an unconstrained objective, where trainable Lagrange multipliers enforce the state constraints. Our approach achieves the smallest distance of violation and violation rate in a load-follow maneuver for an advanced nuclear power plant design.
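
For orientation, a generic chance-constrained policy optimization problem and its Lagrangian relaxation can be written as below. This is a textbook-style sketch, not necessarily the paper's exact formulation: the symbols $\theta$ (policy parameters), $r_t$ (reward), $\gamma$ (discount factor), $\mathcal{S}_{\text{safe}}$ (allowed state set), $\delta$ (tolerated violation probability), and $\beta$ (multiplier step size) are assumptions introduced here for illustration.

```latex
\begin{aligned}
&\max_{\theta}\; J(\theta) = \mathbb{E}_{\pi_\theta}\Big[\sum_{t=0}^{T} \gamma^{t} r_t\Big]
\quad \text{s.t.} \quad \Pr_{\pi_\theta}\big(s_t \notin \mathcal{S}_{\text{safe}}\big) \le \delta, \\
&\mathcal{L}(\theta, \lambda) = J(\theta) - \lambda\,\Big(\Pr_{\pi_\theta}\big(s_t \notin \mathcal{S}_{\text{safe}}\big) - \delta\Big), \qquad \lambda \ge 0, \\
&\lambda \leftarrow \max\Big(0,\; \lambda + \beta\,\Big(\widehat{\Pr}\big(s_t \notin \mathcal{S}_{\text{safe}}\big) - \delta\Big)\Big).
\end{aligned}
```

The policy is trained to maximize $\mathcal{L}$ for the current $\lambda$, while $\lambda$ grows whenever the estimated violation probability $\widehat{\Pr}$ exceeds $\delta$ and shrinks toward zero otherwise.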

Systems and Control, Machine Learning

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper addresses safe reinforcement learning for the supervisory control of nuclear power plants (NPPs). Specifically, it focuses on:

1. **Proposing a new safe reinforcement learning algorithm**: The algorithm is based on Proximal Policy Optimization (PPO). It uses Lagrangian relaxation to transform the constrained optimization problem into an unconstrained one, with trainable Lagrange multipliers enforcing the state constraints.
2. **Creating a physics-based learning environment**: A simplified model (SINDYc, sparse identification of nonlinear dynamics with control) provides an efficient learning environment that shortens simulation feedback time and thereby accelerates training of the reinforcement learning agent.
3. **Implementing supervisory control in advanced nuclear reactors**: Trained reinforcement learning agents control an advanced nuclear reactor during routine operational transients, keeping system states within specified constraints, reducing equipment wear, and improving economic benefits.
4. **Achieving optimal performance**: In load-following operations, the proposed model demonstrates optimal control performance, reducing the total power variation by up to 50% compared to traditional methods.

Through these methods, the paper aims to develop a reinforcement learning control strategy that handles complex operational conditions while ensuring safety and efficiency.
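
The Lagrangian relaxation with a trainable multiplier described in item 1 can be illustrated on a toy problem. The sketch below is not the paper's PPO implementation: it replaces the policy with a single scalar action `a`, and the reward, the constraint, the step sizes, and the function name `solve_toy_problem` are all hypothetical choices made for illustration. The problem is to maximize `-(a - 2)^2` subject to `a <= 1`, whose analytic solution is `a = 1` with multiplier `lam = 2`.

```python
def solve_toy_problem(steps=5000, lr_primal=0.05, lr_dual=0.05):
    """Primal-dual gradient method on a toy constrained problem.

    maximize  r(a) = -(a - 2)^2   subject to  g(a) = a - 1 <= 0

    Lagrangian: L(a, lam) = r(a) - lam * g(a), with lam >= 0.
    All names and constants here are illustrative assumptions.
    """
    a = 0.0    # stand-in for the policy parameters
    lam = 0.0  # trainable Lagrange multiplier
    for _ in range(steps):
        violation = a - 1.0                # g(a); positive means constraint violated
        grad_a = -2.0 * (a - 2.0) - lam    # dL/da
        a += lr_primal * grad_a            # gradient ascent on the Lagrangian
        # Dual ascent: raise lam while the constraint is violated,
        # lower it otherwise, and project back onto lam >= 0.
        lam = max(0.0, lam + lr_dual * violation)
    return a, lam


if __name__ == "__main__":
    a, lam = solve_toy_problem()
    print(f"action = {a:.4f}, multiplier = {lam:.4f}")
```

The unconstrained optimum would be `a = 2`; the multiplier grows until the penalty term pulls the action back to the constraint boundary. In a PPO-based variant, the update of `a` would be replaced by a clipped policy-gradient step on the penalized objective, while the multiplier update keeps the same form.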
