A Safe DRL Method for Fast Solution of Real-Time Optimal Power Flow

Pengfei Wu, Chen Chen, Dexiang Lai, Jian Zhong
2023-08-07
Abstract: High penetration of intermittent renewable energy sources (RESs) has introduced significant uncertainty into modern power systems. To respond rapidly and economically to fluctuations in the power system operating state, this paper proposes a safe deep reinforcement learning (SDRL) method for the fast solution of real-time optimal power flow (RT-OPF) problems. The proposed method accounts for the volatility of RESs and for temporal constraints, and formulates RT-OPF as a Constrained Markov Decision Process (CMDP). During training, the method combines proximal policy optimization (PPO) with the primal-dual method. Instead of folding a constraint-violation penalty into the reward function, it estimates the actor gradients from a Lagrange advantage function derived from two critic systems, one based on the economic reward and one on the violation cost. Decoupling reward and cost alleviates reward sparsity and improves critic approximation accuracy. Moreover, the Lagrange multipliers enable the agent to learn the trade-off between optimality and feasibility. Numerical tests are carried out on the IEEE 9-bus, 30-bus, and 118-bus test systems and compared with penalty-based DRL methods. The results show that the well-trained SDRL agent can significantly improve computational efficiency while satisfying the security constraints and optimality requirements.
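The primal-dual PPO idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption based on the common Lagrangian form used in constrained policy optimization: the combined advantage is taken as A_r - lambda * A_c, the actor maximizes the standard PPO clipped surrogate on that advantage, and the multiplier is updated by projected dual ascent. The function names, the learning rate, and the cost limit are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: assumed Lagrangian form, not the paper's exact method.

def lagrange_advantage(adv_reward, adv_cost, lam):
    """Combine the two critics' advantages (assumed form: A_r - lam * A_c)."""
    return [a_r - lam * a_c for a_r, a_c in zip(adv_reward, adv_cost)]


def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate, averaged over samples (to be maximized)."""
    terms = []
    for r, a in zip(ratios, advantages):
        r_clipped = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        terms.append(min(r * a, r_clipped * a))
    return sum(terms) / len(terms)


def dual_update(lam, mean_cost, cost_limit, lr=0.05):
    """Projected dual ascent: raise lam when the cost exceeds the limit, keep lam >= 0."""
    return max(0.0, lam + lr * (mean_cost - cost_limit))


if __name__ == "__main__":
    # Hypothetical batch: advantages from the reward critic and the cost critic.
    adv_r = [1.0, 2.0, -0.5]
    adv_c = [0.5, 1.0, 0.0]
    ratios = [1.1, 0.9, 1.0]   # pi_new(a|s) / pi_old(a|s)
    lam = 0.5

    adv_l = lagrange_advantage(adv_r, adv_c, lam)
    print("Lagrange advantage:", adv_l)
    print("clipped surrogate:", ppo_clip_objective(ratios, adv_l))
    print("updated multiplier:", dual_update(lam, mean_cost=0.3, cost_limit=0.0))
```

In this sketch the multiplier grows while the average violation cost stays above the limit, which shifts the actor's gradient toward feasibility. Once the cost falls to the limit or below, the multiplier stops growing or shrinks toward zero and the economic reward dominates again.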
Systems and Control