Robust N-1 secure HV Grid Flexibility Estimation for TSO-DSO coordinated Congestion Management with Deep Reinforcement Learning

Zhenqi Wang, Sebastian Wende-von Berg, Martin Braun
DOI: https://doi.org/10.48550/arXiv.2211.05855
2022-12-18
Abstract: The PQ flexibility of distributed energy resources (DERs) in high voltage (HV) grids plays an increasingly critical role in congestion management in TSO grids. This work proposes a multi-stage deep reinforcement learning approach that estimates the PQ flexibility (PQ area) at the TSO-DSO interfaces and identifies the DER PQ setpoints for each operating point, such that DERs in the meshed HV grid can be coordinated to offer flexibility to the transmission grid. In the estimation process, we consider the steady-state grid limits, the robustness of the resulting voltage profile against uncertainties, and the N-1 security criterion regarding thermal line loading, all of which are essential for real-life grid operational planning applications. This is the first use of deep reinforcement learning (DRL) for PQ flexibility estimation. Furthermore, considering the N-1 security criterion for meshed grids and robustness against uncertainty directly in the optimization tasks offers a new perspective alongside the relaxation schemes commonly used to find a solution with mathematical optimal power flow (OPF). Finally, a significant improvement in the computational efficiency of estimating the PQ area is a highlight of the proposed method.
Systems and Control, Artificial Intelligence
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

The paper aims to address the issue of estimating the active and reactive power (PQ) flexibility of distributed energy resources (DERs) in high voltage (HV) grids, particularly when coordinating grid congestion management between transmission system operators (TSO) and distribution system operators (DSO). Specifically, the paper proposes a multi-stage deep reinforcement learning method to estimate PQ flexibility at the TSO-DSO interface and determine the DER PQ setpoints at each operating point, enabling the interconnected HV grid to provide coordinated flexibility.

### Main Challenges

1. **N-1 Security Criterion**: Considering the N-1 security criterion in mathematical optimization remains a challenge. The N-1 criterion requires that the failure of a single HV line should not overload other parts of the system.
2. **Uncertainty Handling**: Uncertainties in operational planning (such as wind and solar forecast errors, aggregated HV/MV load variations) can lead to errors in PQ flexibility estimation, and some setpoints may violate grid constraints.
3. **Computational Efficiency**: Traditional mathematical optimization methods have high computational overhead when dealing with large-scale HV grids, making it difficult to meet the needs of real-world grid operations.

### Solutions

1. **Deep Reinforcement Learning (DRL)**: The paper applies deep reinforcement learning to PQ flexibility estimation for the first time, using artificial neural networks (ANN) to predict DER setpoints, thereby improving computational efficiency.
2. **N-1 Constraint Approximation**: A supervised training approximation method is proposed to predict N-1 constraint violations, thus considering the N-1 security criterion during training.
3. **Voltage Distribution Probability Approximation**: Monte Carlo simulation (MCS) and probabilistic power flow (PPF) methods are used to approximate voltage distribution probabilities, enhancing the robustness of the optimization.
4. **Post-Processing Steps**: Feasible PQ regions are filtered through post-processing steps to ensure that the predicted setpoints comply with grid constraints.

### Experimental Validation

The paper validates the approach using the SimBench HV grid dataset, with results showing:

- **Approximator Performance**: The N-1 approximator and voltage distribution probability approximator perform well under "on-sample" training, accurately identifying constraint violations.
- **Flexibility Optimality**: Compared to traditional mathematical optimization methods (such as PIPS), ANN-OPF demonstrates better performance in most time steps, especially in optimizing maximum/minimum reactive power.
- **Computational Efficiency**: The prediction and post-processing steps of ANN-OPF can be completed within 1 second, significantly outperforming the computation time of traditional methods.

### Conclusion

The paper proposes a novel deep reinforcement learning-based method (ANN-OPF) for estimating PQ flexibility at the TSO-DSO interface. This method considers the N-1 security criterion and voltage distribution robustness through supervised training approximators, ensuring the feasibility of the predicted PQ regions while achieving high computational efficiency. Future research will further explore the applicability of this method under different topological changes.
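
### Illustrative Sketch: Monte Carlo Voltage Robustness Check

The summary above mentions Monte Carlo simulation for approximating voltage distribution probabilities and a post-processing filter for feasible setpoints. The following is a minimal sketch of that idea, not the paper's implementation. It assumes a linearized voltage model (`v = v_base + S · p`) in place of a full power flow, Gaussian forecast errors on the DER active power, and a 0.95 to 1.05 p.u. voltage band. All function names, parameters, and numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def voltage_violation_probability(v_base, sensitivity, p_setpoint, sigma,
                                  n_samples=5000, v_min=0.95, v_max=1.05,
                                  seed=0):
    """Monte Carlo estimate of P(any bus voltage leaves [v_min, v_max]).

    Assumptions (not from the paper): voltages respond linearly to DER
    injections via `sensitivity` (n_bus x n_der, p.u. per MW), and the
    forecast errors are independent zero-mean Gaussians with std `sigma`.
    """
    rng = np.random.default_rng(seed)
    # Sample uncertain injections around the planned setpoint.
    errors = rng.normal(0.0, sigma, size=(n_samples, p_setpoint.size))
    injections = p_setpoint + errors
    # Linearized voltage for every sample: shape (n_samples, n_bus).
    voltages = v_base + injections @ sensitivity.T
    # A sample counts as violated if any bus leaves the voltage band.
    violated = np.any((voltages < v_min) | (voltages > v_max), axis=1)
    return float(violated.mean())


def filter_robust_setpoints(candidates, v_base, sensitivity, sigma,
                            max_violation_prob=0.05):
    """Keep only the candidate setpoints whose estimated violation
    probability stays at or below `max_violation_prob`."""
    return [
        p for p in candidates
        if voltage_violation_probability(v_base, sensitivity, p, sigma)
        <= max_violation_prob
    ]


# Hypothetical two-bus, two-DER example.
v_base = np.array([1.0, 1.0])
sensitivity = np.array([[0.01, 0.0],
                        [0.0, 0.02]])
candidates = [np.array([1.0, 1.0]), np.array([10.0, 0.0])]
robust = filter_robust_setpoints(candidates, v_base, sensitivity, sigma=0.5)
```

In this toy case the first candidate keeps both voltages well inside the band, while the second pushes bus 1 to about 1.10 p.u. and is filtered out. In a realistic setting, the linear model would be replaced by power flow calculations or by a trained approximator of the kind the summary describes.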