Abstract: PQ flexibility from distributed energy resources (DERs) in high voltage (HV) grids plays an increasingly critical role in congestion management in TSO grids. This work proposes a multi-stage deep reinforcement learning approach that estimates the PQ flexibility (PQ area) at the TSO-DSO interfaces and identifies the DER PQ setpoints for each operating point, so that DERs in the meshed HV grid can be coordinated to offer flexibility to the transmission grid. The estimation process considers the steady-state grid limits, the robustness of the resulting voltage profile against uncertainties, and the N-1 security criterion for thermal line loading, all of which are essential for real-life grid operational planning. According to the authors, this is the first use of deep reinforcement learning (DRL) for PQ flexibility estimation. Furthermore, considering the N-1 security criterion for meshed grids and robustness against uncertainty directly in the optimization tasks offers a new perspective beside the common relaxation schemes used to find a solution with mathematical optimal power flow (OPF). Finally, the proposed method significantly improves the computational efficiency of PQ area estimation.

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

The paper aims to address the issue of estimating the active and reactive power (PQ) flexibility of distributed energy resources (DERs) in high voltage (HV) grids, particularly when coordinating grid congestion management between transmission system operators (TSO) and distribution system operators (DSO). Specifically, the paper proposes a multi-stage deep reinforcement learning method to estimate PQ flexibility at the TSO-DSO interface and determine the DER PQ setpoints at each operating point, enabling the interconnected HV grid to provide coordinated flexibility.

### Main Challenges

1. **N-1 Security Criterion**: Considering the N-1 security criterion in mathematical optimization remains a challenge. The N-1 criterion requires that the failure of a single HV line should not overload other parts of the system.
2. **Uncertainty Handling**: Uncertainties in operational planning (such as wind and solar forecast errors, aggregated HV/MV load variations) can lead to errors in PQ flexibility estimation, and some setpoints may violate grid constraints.
3. **Computational Efficiency**: Traditional mathematical optimization methods have high computational overhead when dealing with large-scale HV grids, making it difficult to meet the needs of real-world grid operations.

### Solutions

1. **Deep Reinforcement Learning (DRL)**: The paper applies deep reinforcement learning to PQ flexibility estimation for the first time, using artificial neural networks (ANN) to predict DER setpoints, thereby improving computational efficiency.
2. **N-1 Constraint Approximation**: A supervised training approximation method is proposed to predict N-1 constraint violations, thus considering the N-1 security criterion during training.
3. **Voltage Distribution Probability Approximation**: Monte Carlo simulation (MCS) and probabilistic power flow (PPF) methods are used to approximate voltage distribution probabilities, enhancing the robustness of the optimization.
4. **Post-Processing Steps**: Feasible PQ regions are filtered through post-processing steps to ensure that the predicted setpoints comply with grid constraints.

### Experimental Validation

The paper validates the approach using the SimBench HV grid dataset, with results showing:

- **Approximator Performance**: The N-1 approximator and voltage distribution probability approximator perform well under "on-sample" training, accurately identifying constraint violations.
- **Flexibility Optimality**: Compared to traditional mathematical optimization methods (such as PIPS), ANN-OPF demonstrates better performance in most time steps, especially in optimizing maximum/minimum reactive power.
- **Computational Efficiency**: The prediction and post-processing steps of ANN-OPF can be completed within 1 second, significantly outperforming the computation time of traditional methods.

### Conclusion

The paper proposes a novel deep reinforcement learning-based method (ANN-OPF) for estimating PQ flexibility at the TSO-DSO interface. This method considers the N-1 security criterion and voltage distribution robustness through supervised training approximators, ensuring the feasibility of the predicted PQ regions while achieving high computational efficiency. Future research will further explore the applicability of this method under different topological changes.
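
As a rough illustration of the Monte Carlo voltage check and the post-processing filter listed under Solutions, the sketch below estimates, for each candidate PQ setpoint, the probability that a bus voltage stays within limits under forecast error, and keeps only the setpoints that pass a threshold. This is not the paper's implementation: the linear voltage model stands in for a real power flow, and every name and number (`DV_DP`, `DV_DQ`, the voltage band, the 95 % threshold, the candidate setpoints) is a made-up assumption chosen for the example.

```python
import random

# Hypothetical linearized grid model (assumption, not from the paper):
# bus voltage in p.u. as a linear function of the DER setpoint.
V0 = 1.00          # voltage at zero injection (assumed)
DV_DP = 0.0005     # p.u. per MW (assumed sensitivity)
DV_DQ = 0.0020     # p.u. per Mvar (assumed sensitivity)
V_MIN, V_MAX = 0.95, 1.05   # assumed voltage band in p.u.


def voltage(p_mw, q_mvar, forecast_error_mw):
    """Toy stand-in for a power flow: voltage under an active-power error."""
    return V0 + DV_DP * (p_mw + forecast_error_mw) + DV_DQ * q_mvar


def prob_voltage_ok(p_mw, q_mvar, sigma_mw, n_samples=2000, rng=None):
    """Monte Carlo estimate of P(V_MIN <= v <= V_MAX) for one setpoint."""
    if rng is None:
        rng = random.Random(0)
    ok = 0
    for _ in range(n_samples):
        err = rng.gauss(0.0, sigma_mw)  # sampled forecast error in MW
        if V_MIN <= voltage(p_mw, q_mvar, err) <= V_MAX:
            ok += 1
    return ok / n_samples


def filter_setpoints(candidates, sigma_mw, min_prob=0.95):
    """Post-processing: keep setpoints that are within limits often enough."""
    rng = random.Random(42)  # fixed seed so the sketch is reproducible
    return [
        (p, q)
        for p, q in candidates
        if prob_voltage_ok(p, q, sigma_mw, rng=rng) >= min_prob
    ]


# Hypothetical candidate setpoints (MW, Mvar), e.g. as predicted by an ANN.
candidates = [(0.0, 0.0), (20.0, 10.0), (40.0, 24.0), (60.0, 30.0)]
feasible = filter_setpoints(candidates, sigma_mw=5.0)
print(feasible)
```

In a realistic setting the `voltage` function would be replaced by a power flow calculation or a trained approximator, and the same filter could also reject setpoints flagged by an N-1 violation check.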

Robust N-1 secure HV Grid Flexibility Estimation for TSO-DSO coordinated Congestion Management with Deep Reinforcement Learning

Efficient distribution grid flexibility provision through model-based MV grid and model-less LV grid approach

Fast Mapping of Flexibility Regions at TSO-DSO Interfaces under Uncertainty

Risk-aware Flexible Resource Utilization in an Unbalanced Three-Phase Distribution Network using SDP-based Distributionally Robust Optimal Power Flow

A Holistic Power Optimization Approach for Microgrid Control Based on Deep Reinforcement Learning

Aggregated distribution grid flexibilities in subtransmission grid operational management

Double Deep Q-learning Based Real-Time Optimization Strategy for Microgrids

Quantum Reinforcement Learning-Based Two-Stage Unit Commitment Framework for Enhanced Power Systems Robustness

Deep reinforcement learning approach to estimate the energy-mix proportion for secure operation of converter-dominated power system

Distribution Grid Robust Operation under Forecast Uncertainties with Flexibility Estimation from Low Voltage Grids using a Monitoring and Control Equipment

Optimizing Load Scheduling in Power Grids Using Reinforcement Learning and Markov Decision Processes

Deep Reinforcement Learning for Voltage Control and Renewable Accommodation Using Spatial-Temporal Graph Information

Online EVs Vehicle-to-Grid Scheduling Coordinated with Multi-Energy Microgrids: A Deep Reinforcement Learning-Based Approach

Robust Federated Deep Reinforcement Learning for Optimal Control in Multiple Virtual Power Plants with Electric Vehicles

A Safe DRL Method for Fast Solution of Real-Time Optimal Power Flow

Multi-agent DRL-based Data-Driven Approach for PEVs Charging/discharging Scheduling in Smart Grid

Flexibility Prediction of Aggregated Electric Vehicles and Domestic Hot Water Systems in Smart Grids

Deep Reinforcement Learning-Based Method for Joint Optimization of Mobile Energy Storage Systems and Power Grid with High Renewable Energy Sources

Robust Deep Reinforcement Learning for Volt-VAR Optimization in Active Distribution System under Uncertainty

Distributed Deep Reinforcement Learning-based Approach for Fast Preventive Control Considering Transient Stability Constraints