On Training Effective Reinforcement Learning Agents for Real-time Power Grid Operation and Control

Ruisheng Diao, Di Shi, Bei Zhang, Siqi Wang, Haifeng Li, Chunlei Xu, Tu Lan, Desong Bian, Jiajun Duan
DOI: https://doi.org/10.48550/arXiv.2012.06458
2020-12-12
Abstract: Deriving fast and effectively coordinated control actions remains a grand challenge affecting the secure and economic operation of today's large-scale power grid. This paper presents a novel artificial intelligence (AI) based methodology to achieve multi-objective real-time power grid control for real-world implementation. A state-of-the-art off-policy reinforcement learning (RL) algorithm, soft actor-critic (SAC), is adopted to train AI agents with multi-thread offline training and periodic online training for regulating voltages and transmission losses without violating thermal constraints of lines. A software prototype was developed and deployed in the control center of SGCC Jiangsu Electric Power Company that interacts with their Energy Management System (EMS) every 5 minutes. Massive numerical studies using actual power grid snapshots in the real-time environment verify the effectiveness of the proposed approach. Well-trained SAC agents can learn to provide effective and subsecond control actions in regulating voltage profiles and reducing transmission losses.
Optimization and Control, Systems and Control
What problem does this paper attempt to address?
This paper addresses the problem of deriving fast, coordinated, and effective control actions in modern large-scale power systems so that they operate securely and economically. Specifically, it focuses on regulating voltages and reducing transmission losses through real-time optimal control under security constraints such as line thermal limits. The challenge stems from the growing integration of intermittent energy sources, energy storage devices, and power electronic devices, which makes grid behavior more random and dynamic and affects the stable and economic operation of power systems.
To address these challenges, the paper proposes an artificial intelligence-based method that uses a state-of-the-art off-policy reinforcement learning algorithm, Soft Actor-Critic (SAC), to train AI agents for real-time multi-objective power system control. The method provides effective control decisions in a simulated environment and has also been verified in practice: a software prototype deployed in the control center of State Grid Jiangsu Electric Power Company in China interacts with the Energy Management System (EMS) every 5 minutes and provides fast (less than 20 milliseconds) control actions for regulating the voltage profile and reducing transmission losses.
The main contributions of the paper are as follows:
1. Formulating the AC Optimal Power Flow (AC OPF) control problem as a Markov Decision Process (MDP), so that reinforcement learning algorithms can be applied to find suboptimal solutions.
2. Providing a general and flexible framework that can incorporate various power system control objectives and constraints when training AI agents.
3. Showing that the trained SAC agent can quickly produce control actions when abnormal power system states are detected, especially after sudden changes in voltages and line flows.
4. Developing a multi-threaded SAC agent training process with periodic model updates to maintain long-term control performance and mitigate overfitting.
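To make the MDP framing in contribution 1 more concrete, the following is a minimal sketch of how a multi-objective reward could combine voltage regulation, thermal limits, and loss reduction. It is not the paper's reward function: the `GridSnapshot` type, the `reward` function, the 0.95 to 1.05 per-unit voltage band, and the penalty weight of 10 are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class GridSnapshot:
    """Hypothetical power-flow result; all field names are illustrative."""
    bus_voltages_pu: Sequence[float]  # bus voltage magnitudes, per unit
    line_loadings: Sequence[float]    # line flow / thermal limit (1.0 = at limit)
    losses_mw: float                  # total transmission losses in MW


def reward(before: GridSnapshot, after: GridSnapshot,
           v_min: float = 0.95, v_max: float = 1.05,
           penalty: float = 10.0) -> float:
    """Toy reward: penalize violations; otherwise reward relative loss reduction.

    The voltage band and penalty weight are assumed values, not the paper's.
    """
    # Total distance of bus voltages outside the assumed [v_min, v_max] band.
    v_viol = sum(max(0.0, v_min - v) + max(0.0, v - v_max)
                 for v in after.bus_voltages_pu)
    # Total loading above the thermal limit across all lines.
    overload = sum(max(0.0, load - 1.0) for load in after.line_loadings)

    if v_viol > 0.0 or overload > 0.0:
        return -penalty * (v_viol + overload)
    if before.losses_mw <= 0.0:
        return 0.0
    # No violations: reward the fractional reduction in transmission losses.
    return (before.losses_mw - after.losses_mw) / before.losses_mw
```

In this sketch, security constraints take priority: any voltage or thermal violation after the control action yields a negative reward, and loss reduction is rewarded only once the resulting state is violation-free. An agent such as SAC would then be trained to maximize this signal over grid snapshots.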