Abstract: Utilizing distributed renewable and energy storage resources in local distribution networks via peer-to-peer (P2P) energy trading has long been touted as a solution to improve energy systems' resilience and sustainability. Consumers and prosumers (those who have energy generation resources), however, do not have the expertise to engage in repeated P2P trading, and the zero-marginal costs of renewables present challenges in determining fair market prices. To address these issues, we propose multi-agent reinforcement learning (MARL) frameworks to help automate consumers' bidding and management of their solar PV and energy storage resources, under a specific P2P clearing mechanism that utilizes the so-called supply-demand ratio. In addition, we show how the MARL frameworks can integrate physical network constraints to realize voltage control, hence ensuring physical feasibility of the P2P energy trading and paving the way for real-world implementations.

What problem does this paper attempt to address?

The problems this paper attempts to solve fall into three areas:

1. **Lack of technical knowledge**: Consumers and prosumers (i.e., individuals with solar photovoltaic panels and/or energy storage devices) lack the technical knowledge required to participate in repeated peer-to-peer (P2P) energy trading, which makes it difficult for them to manage and trade their energy resources efficiently.
2. **Difficulties in market pricing**: Because the marginal cost of renewable energy (such as solar energy) is almost zero, determining a fair market price is challenging. The traditional uniform-price auction mechanism does not operate effectively in this case because the market-clearing price is usually close to zero.
3. **Network feasibility**: Although P2P energy trading only involves financial transactions, the energy itself is delivered through the physical distribution network. How to maintain network feasibility in a local P2P trading market is an important open question.

To solve these problems, the paper proposes a framework based on multi-agent reinforcement learning (MARL) that automates consumers' bidding and manages their solar photovoltaic and energy storage resources. The framework also integrates physical network constraints to achieve voltage control and ensure the physical feasibility of P2P energy trading, paving the way for practical applications.

### Specific solutions

1. **Multi-agent reinforcement learning (MARL) framework**:
   - Each agent participating in the P2P auction is modeled as a Markov decision process (MDP), and agents can exchange information and reach a consensus.
   - A consensus-based actor-critic algorithm for decentralized MARL is proposed, in which each agent optimizes its decisions by iteratively updating its policy and value function.
2. **Supply-demand ratio (SDR) mechanism**:
   - The SDR is defined as the ratio of total supply to total demand, where \(S_t\) and \(B_t\) are the sets of sellers and buyers at time \(t\) and \(b_{i,t}\) is the quantity bid by agent \(i\) (negative for buyers, hence the minus sign in the denominator):
     \[ \text{SDR}_t=\frac{\sum_{i\in S_t}b_{i,t}}{-\sum_{i\in B_t}b_{i,t}} \]
   - The market-clearing price \(P_t\) is determined by the SDR:
     \[ P_t = \begin{cases} (\text{FIT}-\text{UR})\cdot\text{SDR}_t+\text{UR}, & 0\leq\text{SDR}_t\leq1 \\ \text{FIT}, & \text{SDR}_t > 1 \end{cases} \]
   - This mechanism avoids the near-zero clearing price caused by zero-marginal-cost resources and simplifies bidding, since agents only need to submit quantities.
3. **Integration of network constraints**:
   - Network constraints (such as voltage regulation) are enforced by adding a virtual penalty term to each agent's reward function:
     \[ r_v = -\lambda\sum_{\kappa = 1}^N\text{clip}\left[\max\left(|V_{\kappa,t}|-\overline{V}_{\kappa},\ \underline{V}_{\kappa}-|V_{\kappa,t}|\right),\{0,M\}\right] \]
   - Here \(\overline{V}_{\kappa}\) and \(\underline{V}_{\kappa}\) are the upper and lower voltage limits of node \(\kappa\), respectively, and the \(\text{clip}\) function bounds each node's violation term to the interval \([0, M]\), so nodes within limits contribute no penalty.

Through these methods, the paper addresses both the lack of technical knowledge and the difficulty of market pricing, and it ensures the feasibility of P2P energy trading in the physical network, providing theoretical and technical support for practical applications.
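The SDR pricing rule and the voltage penalty described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's implementation. The function names (`sdr_clearing_price`, `voltage_penalty`) and the numeric values are hypothetical. The code assumes that sell bids are positive and buy bids are negative, that FIT and UR denote a feed-in tariff and a utility rate in the same units, and that a period with no buyers is treated as oversupply (the formulas themselves leave that case undefined).

```python
from typing import Sequence


def sdr_clearing_price(bids: Sequence[float], fit: float, ur: float) -> float:
    """Clearing price P_t from the supply-demand ratio.

    bids: signed quantities b_{i,t}; positive = sell, negative = buy
          (assumed sign convention).
    fit, ur: the FIT and UR parameters of the pricing rule.
    """
    supply = sum(b for b in bids if b > 0)
    demand = -sum(b for b in bids if b < 0)
    if demand == 0:
        # Assumption: with no buyers the SDR is undefined, so treat the
        # period as oversupplied and fall back to FIT.
        return fit
    sdr = supply / demand
    if sdr > 1:
        return fit
    # Linear interpolation: UR at SDR = 0, FIT at SDR = 1.
    return (fit - ur) * sdr + ur


def voltage_penalty(
    v_mag: Sequence[float],
    v_min: Sequence[float],
    v_max: Sequence[float],
    lam: float,
    big_m: float,
) -> float:
    """Reward term r_v: -lam times the sum of clipped voltage violations."""
    total = 0.0
    for v, lo, hi in zip(v_mag, v_min, v_max):
        violation = max(v - hi, lo - v)  # positive only outside [lo, hi]
        total += min(max(violation, 0.0), big_m)  # clip to [0, big_m]
    return -lam * total


# Hypothetical example: one seller offering 3 kWh, two buyers wanting 6 kWh.
price = sdr_clearing_price([3.0, -2.0, -4.0], fit=0.05, ur=0.15)

# Hypothetical per-unit voltages at three nodes with limits 0.95 and 1.05.
penalty = voltage_penalty(
    [1.00, 1.07, 0.93], [0.95] * 3, [1.05] * 3, lam=10.0, big_m=1.0
)
```

In the example, supply covers half of demand, so the price lands midway between UR and FIT. Only the two out-of-range nodes contribute to the penalty.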

Peer-to-Peer Energy Trading of Solar and Energy Storage: A Networked Multiagent Reinforcement Learning Approach

Peer-to-Peer Trading for Energy-Saving Based on Reinforcement Learning

Scalable Coordinated Management of Peer-to-peer Energy Trading: A Multi-Cluster Deep Reinforcement Learning Approach

A Scalable Privacy-Preserving Multi-Agent Deep Reinforcement Learning Approach for Large-Scale Peer-to-Peer Transactive Energy Trading

Multi-Agent Reinforcement Learning With Privacy Preservation for Continuous Double Auction-Based P2P Energy Trading

Mean-Field Multi-Agent Reinforcement Learning for Peer-to-Peer Multi-Energy Trading

Peer to Peer Distributed Solar Energy Trading.

Peer-to-peer energy trading with energy trading consistency in interconnected multi-energy microgrids: A multi-agent deep reinforcement learning approach

Strategic Peer-to-peer Energy Trading Framework Considering Distribution Network Constraints

Peer-to-Peer Energy Trading and Energy Conversion in Interconnected Multi-Energy Microgrids Using Multi-Agent Deep Reinforcement Learning

Peer-to-Peer Energy Trading Using Prediction Intervals of Renewable Energy Generation.

Peer-to-peer energy trading optimization in energy communities using multi-agent deep reinforcement learning

Multi-Agent Reinforcement Learning for Automated Peer-to-Peer Energy Trading in Double-Side Auction Market

P2P Energy Trading for Coordinated Home Energy Management and Voltage Regulation

Peer-To-Peer Energy Sharing With Battery Storage: Energy Pawn In The Smart Grid

Stackelberg game and multi-agent deep reinforcement learning based peer to peer energy trading for multi-microgrids

Energy Pricing in P2P Energy Systems Using Reinforcement Learning

Deep Reinforcement Learning and Blockchain for Peer-to-Peer Energy Trading among Microgrids

Multi-Agent Learning in Double-side Auctions for Peer-to-peer Energy Trading

Federated reinforcement learning for smart building joint peer-to-peer energy and carbon allowance trading