Abstract: To coordinate the interests of the operator and users in a microgrid (MG) under complex and changeable operating conditions, this paper proposes a microgrid scheduling model that considers the thermal flexibility of thermostatically controlled loads and demand response by leveraging physical informed-inspired deep reinforcement learning (DRL) based bi-level programming. To overcome the non-convex limitations of Karush-Kuhn-Tucker (KKT)-based methods, a novel optimization solution method based on DRL theory is proposed to handle the bi-level programming through alternate iterations between levels. Specifically, an improved A3C algorithm, called AutoML-PER-A3C, is designed to solve the upper-level problem by combining the DRL algorithm asynchronous advantage actor-critic (A3C) with an automated machine learning-prioritized experience replay (AutoML-PER) strategy that improves the generalization performance of A3C, while the DOCPLEX optimizer is adopted to address the lower-level problem. In this solution process, AutoML is used to automatically optimize hyperparameters, and PER improves learning efficiency and quality by extracting the most valuable samples. The test results demonstrate that the presented approach manages to reconcile the interests of multiple stakeholders in the MG by fully exploiting various flexibility resources. Furthermore, in terms of economic viability and computational efficiency, the proposal vastly exceeds other advanced reinforcement learning methods.
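The abstract's remark that PER extracts the most valuable samples can be illustrated with a minimal sketch of generic proportional prioritized experience replay. This is not the paper's AutoML-PER strategy: the class name `PrioritizedReplayBuffer`, the parameters `alpha`, `beta`, and `eps`, and the use of the absolute TD error as the priority are all assumptions made for illustration.

```python
import random


class PrioritizedReplayBuffer:
    """Toy proportional prioritized replay (illustrative, not the paper's AutoML-PER)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha  # 0 gives uniform sampling, larger values skew toward high priority
        self.eps = eps      # keeps every priority strictly positive
        self.items = []
        self.priorities = []
        self.next_slot = 0

    def _priority(self, td_error):
        # Assumption: priority is the absolute TD error raised to alpha.
        return (abs(td_error) + self.eps) ** self.alpha

    def add(self, transition, td_error):
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(self._priority(td_error))
        else:
            # Buffer is full: overwrite the oldest slot.
            self.items[self.next_slot] = transition
            self.priorities[self.next_slot] = self._priority(td_error)
        self.next_slot = (self.next_slot + 1) % self.capacity

    def probabilities(self):
        total = sum(self.priorities)
        return [p / total for p in self.priorities]

    def sample(self, batch_size, beta=0.4, rng=random):
        probs = self.probabilities()
        idx = rng.choices(range(len(self.items)), weights=probs, k=batch_size)
        n = len(self.items)
        # Importance-sampling weights correct the bias of non-uniform sampling;
        # they are normalized by the largest weight in the batch.
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idx, [self.items[i] for i in idx], weights

    def update(self, idx, td_errors):
        # Refresh priorities after the learner recomputes TD errors.
        for i, err in zip(idx, td_errors):
            self.priorities[i] = self._priority(err)
```

In a training loop, a learner would call `sample`, scale each sample's loss by the returned weight, and then pass the new TD errors to `update`. How the paper couples this with AutoML hyperparameter tuning is not described in the abstract.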

What problem does this paper attempt to address?

This paper attempts to solve the coordination problems among different stakeholders in microgrid (MG) scheduling, especially under complex and changeable operating conditions. Specifically, the article focuses on the following issues:

1. **Benefit Coordination**: How to coordinate the benefits between operators and users in the microgrid, especially when considering the thermal flexibility of thermostatically controlled loads (TCLs) and demand response.
2. **Non-convex Optimization Problem**: Traditional methods based on Karush-Kuhn-Tucker (KKT) conditions have limitations in dealing with non-convex optimization problems and cannot effectively cope with the complexity and uncertainty in microgrid scheduling.
3. **Multi-agent Interaction**: How to model the interaction of different entities in the energy trading process through bi-level programming and optimize the decision-making at each level on this basis.

To solve these problems, the authors propose a physics-information-inspired deep reinforcement learning (DRL) method, which combines the asynchronous advantage actor-critic (A3C) algorithm and the automated machine learning-prioritized experience replay (AutoML-PER) strategy. This method aims to improve the generalization performance of the A3C algorithm, better solve the bi-level programming problem, and improve the economic feasibility and computational efficiency of microgrid scheduling.

### Specific Solutions

1. **Bi-level Scheduling Model**: A bi-level scheduling model is constructed, where the upper level is the pricing strategy of the microgrid operator and the lower level is the energy consumption strategy of the user.
2. **DRL Algorithm**: The AutoML-PER-A3C algorithm is used to solve the upper-level problem, and the DOCPLEX optimizer is used to solve the lower-level problem. By alternatingly iterating the two levels, the non-convex limitations of traditional methods are overcome.
3. **Improved A3C Algorithm**: AutoML is introduced to automatically optimize hyper-parameters, and PER is used to extract the most valuable samples to improve the learning efficiency and quality.
4. **Physics-Information-inspired Reward Mechanism**: A reward function based on physical information is designed to help guide operators to make better scheduling decisions.

### Mathematical Formula Representation

- **State Transition Equation**:
  \[ \dot{T}_n(t)=\frac{1}{C_{air,n}}[T_{out}(t)-T_n(t)]+\frac{1}{C_{m,n}}[T_{m,n}(t)-T_n(t)]+P_{TCL,n}B(T_n(t), s_{i,n})+Q_{air} \]
- **Objective Function**:
  \[ \max F_1 = f_{income}-f_{cost} \]
  where
  \[ f_{income}=\lambda_t\sum_l P_{load,t}+\lambda_{gen}\sum_n P_{TCL,n}B(T_n(t), s_{i,n})+\lambda_{sell,t}P_{sell,t} \]
  \[ f_{cost}=\zeta\sum_k\max(P_{TSL,t}, 0)+P_{buy,t}(\lambda_{buy,t}+\mu_{import})+\mu_{export}P_{sell,t} \]
- **Constraint Conditions**:
  - Price Constraint: \[ \lambda_{min}\leq\lambda_t\leq\lambda_{max} \]
  - Power Balance Constraint: …
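The bi-level structure described above (the operator sets a price within bounds, users respond, and the operator's profit is evaluated on that response) can be sketched with a deliberately simplified toy. Everything here is an assumption for illustration: an exhaustive price search stands in for AutoML-PER-A3C, a closed-form quadratic-utility response stands in for the DOCPLEX solve, and the numeric parameters and the function names `user_response`, `operator_profit`, and `solve_bilevel` are made up.

```python
def user_response(price, a=10.0, b=2.0, d_min=0.0, d_max=4.0):
    """Lower level (stand-in for the DOCPLEX solve).

    The user maximizes a*d - 0.5*b*d**2 - price*d over d in [d_min, d_max];
    the optimum is the unconstrained solution clipped to the bounds.
    """
    unconstrained = (a - price) / b
    return min(max(unconstrained, d_min), d_max)


def operator_profit(price, demand, buy_price=2.0):
    """Upper-level objective in the spirit of F1 = f_income - f_cost.

    Assumption: income is price * demand and cost is the energy bought
    from the main grid at a fixed buy_price.
    """
    return price * demand - buy_price * demand


def solve_bilevel(price_min=1.0, price_max=9.0, steps=81):
    """Alternate between the levels for each candidate price.

    A grid search replaces the DRL agent here; the price stays inside
    [price_min, price_max], mirroring the price constraint.
    """
    best = None
    for k in range(steps):
        price = price_min + (price_max - price_min) * k / (steps - 1)
        demand = user_response(price)            # lower level reacts to the price
        profit = operator_profit(price, demand)  # upper level scores the outcome
        if best is None or profit > best[2]:
            best = (price, demand, profit)
    return best


if __name__ == "__main__":
    price, demand, profit = solve_bilevel()
    print(f"price={price:.2f} demand={demand:.2f} profit={profit:.2f}")
```

The point of the sketch is only the call order: each upper-level candidate triggers a lower-level solve before the upper-level objective is computed. A learning agent would replace the grid with sampled pricing actions and use the profit, or a physics-informed variant of it, as the reward.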

Physical Informed-Inspired Deep Reinforcement Learning Based Bi-Level Programming for Microgrid Scheduling
