Enhanced Oblique Decision Tree Enabled Policy Extraction for Deep Reinforcement Learning in Power System Emergency Control
Yuxin Dai, Qimei Chen, Jun Zhang, Xiaohui Wang, Yilin Chen, Tianlu Gao, Peidong Xu, Siyuan Chen, Siyang Liao, Huaiguang Jiang, David Wen-zhong Gao
DOI: https://doi.org/10.1016/j.epsr.2022.107932
IF: 3.818
2022-08-01
Electric Power Systems Research
Abstract: Deep reinforcement learning (DRL) algorithms have successfully solved many challenging problems in various power system control scenarios. However, their decision-making process is usually regarded as a black box. Furthermore, how DRL models interact with human intelligence remains an open problem. Thus, this paper proposes a policy extraction framework that distills a complex DRL model into an explainable policy. The framework has three parts: 1) DRL training and data generation. We train an agent for a specific control task and generate data that capture the agent's control policy. 2) Policy extraction. We propose an information gain rate based weighted oblique decision tree (IGR-WODT) for DRL policy extraction. 3) Policy evaluation. We define three metrics to evaluate the performance of the proposed approach. A case study on the under-voltage load shedding problem shows that IGR-WODT improves performance compared with DRL, the weighted oblique decision tree, and the univariate decision tree. The proposed policy extraction method could give dispatchers an intuitive explanation of the neural network's decision-making process when they make final decisions on power grid operation. Also, the resulting rule-based controller could replace the deep neural network-based controller in many field edge devices with limited computing resources while providing comparable performance.
engineering, electrical & electronic
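
The abstract names an information gain rate criterion for oblique splits but gives no formulas. As a rough illustration only, the sketch below scores a hard oblique split of the form w . x + b <= 0 with the classic C4.5-style gain ratio (information gain divided by split information). This is an assumption for illustration, not the paper's IGR-WODT formulation, which may weight samples softly. The function names, the toy states, and the action labels (0 = no shedding, 1 = shed load) are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def oblique_gain_ratio(X, y, w, b):
    """Gain ratio of the hard oblique split  w . x + b <= 0.

    Assumption: classic C4.5-style gain ratio, used here as a stand-in
    for the paper's information gain rate criterion.
    """
    left, right = [], []
    for x, label in zip(X, y):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        (left if score <= 0 else right).append(label)
    if not left or not right:
        return 0.0  # degenerate split: every sample falls on one side
    n = len(y)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    gain = entropy(y) - children
    # Split information: entropy of the left/right partition itself.
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return gain / split_info


# Hypothetical toy data: two state features per sample, one action label each.
X_demo = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
y_demo = [0, 0, 1, 1]

oblique_score = oblique_gain_ratio(X_demo, y_demo, w=[1.0, 1.0], b=-3.0)
axis_score = oblique_gain_ratio(X_demo, y_demo, w=[1.0, 0.0], b=-0.5)
print(oblique_score, axis_score)
```

On this toy data the oblique hyperplane separates the two actions cleanly, so it scores higher than the axis-aligned split that isolates a single sample. A tree builder would search over w and b to maximize such a score at each node.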