Two-Critic Deep Reinforcement Learning for Inverter-based Volt-Var Control in Active Distribution Networks
Qiong Liu, Ye Guo, Lirong Deng, Haotian Liu, Dongyu Li, Hongbin Sun, Wenqi Huang
DOI: https://doi.org/10.1109/tste.2024.3376369
IF: 8.31
2024-01-01
IEEE Transactions on Sustainable Energy
Abstract: Inverter-based Volt-VAR control (IB-VVC) can be simplified as a single-period optimization problem. However, recent deep reinforcement learning (DRL) methods formulate IB-VVC as a Markov decision process (MDP) and solve it as a multi-period optimization problem. This complicates the IB-VVC problem and degrades the performance of DRL algorithms. To avoid this, this paper formulates inverter-based VVC as a one-step MDP and designs a single-period DRL algorithm to solve it. This simplifies the DRL approach considerably, accelerates convergence, and improves control performance. VVC has two objectives, eliminating voltage violations and minimizing power loss, and these objectives have different profiles. Existing DRL methods use one critic neural network to approximate the two objectives together without considering their distinct properties, which makes the critic harder to train. To alleviate this, we design a two-critic approach that approximates the two objective functions with two separate critic neural networks. The better approximation capability further accelerates convergence and improves the control performance of the DRL method. Based on the single-period two-critic DRL (TC) approach, we design two DRL algorithms: 1) TC-DDPG with a deterministic policy and 2) TC-SAC with a stochastic policy. Further, we extend TC-DRL to a multi-agent TC approach to show that it scales well to multi-agent DRL algorithms. Simulations conducted on 33-bus and 69-bus test distribution networks demonstrate the superiority of the proposed approach for both single-agent and multi-agent DRL algorithms.
This paper proposes a two-critic deep reinforcement learning (TC-DRL) approach for inverter-based volt-var control (IB-VVC) in active distribution networks.
Because the two objectives of VVC, minimizing power loss and eliminating voltage violations, have different mathematical properties, we use two critics to approximate them separately, which reduces the learning difficulty of each critic. The TC-DRL approach is compatible with many actor-critic DRL algorithms for centralized IB-VVC problems, and we design two centralized DRL algorithms as examples. For decentralized IB-VVC, we extend the approach to a multi-agent TC-DRL approach and further simplify it by having all agents share the same centralized two-critic. Extensive simulation experiments show that the two proposed centralized TC-DRL algorithms require fewer iterations and achieve better results than recent DRL algorithms, and that the multi-agent TC-DRL algorithms work well for decentralized IB-VVC problems under different limited real-time measurement conditions.
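The two-critic, one-step idea described above can be illustrated with a minimal toy sketch. It is not the authors' implementation: the linear least-squares "critics" stand in for critic neural networks, and the feature matrix, the toy reward vectors, and the `violation_weight` parameter are all hypothetical. What it shows is that in a one-step MDP each critic regresses directly onto its own immediate reward (no bootstrapped next-state value), and that the actor would then be trained against the combined estimate.

```python
import numpy as np


def fit_critic(features, targets):
    """Least-squares linear critic (a stand-in for a critic neural network)."""
    weights, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return weights


def two_critic_value(features, w_loss, w_violation, violation_weight=1.0):
    """Combined estimate an actor would maximize.

    `violation_weight` is a hypothetical trade-off parameter.
    """
    return features @ w_loss + violation_weight * (features @ w_violation)


# Hypothetical data: each row is a feature vector for one (state, action) pair.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 4))

# Toy reward components, kept linear so the linear critics can fit them exactly.
r_loss = features @ np.array([-1.0, 0.5, 0.0, 0.2])        # power-loss term
r_violation = features @ np.array([0.0, 0.0, -2.0, 0.0])   # voltage-violation term

# One-step MDP: each critic's target is simply its own immediate reward.
w_loss = fit_critic(features, r_loss)
w_violation = fit_critic(features, r_violation)

q = two_critic_value(features, w_loss, w_violation)
```

In this toy setting both rewards are linear, so the benefit of separating the critics does not show up; the sketch only fixes the structure (two regression targets, one combined actor objective).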
energy & fuels; engineering, electrical & electronic; green & sustainable science & technology