Chen Feng, Andrew L. Liu
Abstract: Utilizing distributed renewable and energy storage resources in local distribution networks via peer-to-peer (P2P) energy trading has long been touted as a solution to improve energy systems' resilience and sustainability. Consumers and prosumers (those who have energy generation resources), however, do not have the expertise to engage in repeated P2P trading, and the zero marginal costs of renewables present challenges in determining fair market prices. To address these issues, we propose multi-agent reinforcement learning (MARL) frameworks to help automate consumers' bidding and management of their solar PV and energy storage resources, under a specific P2P clearing mechanism that utilizes the so-called supply-demand ratio. In addition, we show how the MARL frameworks can integrate physical network constraints to realize voltage control, hence ensuring physical feasibility of the P2P energy trading and paving the way for real-world implementations.
What problem does this paper attempt to address?
The problems that this paper attempts to solve mainly focus on the following aspects:
1. **Lack of technical knowledge**: Consumers and prosumers (i.e., those with solar photovoltaic panels and/or energy storage devices) lack the expertise required to participate in repeated peer-to-peer (P2P) energy trading, which makes it difficult for them to manage and trade their energy resources efficiently.
2. **Difficulties in market pricing**: Because the marginal cost of renewable energy (such as solar energy) is almost zero, determining a fair market price is challenging. A traditional uniform-price auction does not work well in this setting because its market-clearing price is usually close to zero.
3. **Network feasibility**: Although P2P energy trading involves only financial transactions, the energy itself must be delivered through the physical distribution network. How to maintain network feasibility in a local P2P trading market is an important open question.
To solve the above problems, the paper proposes a framework based on multi-agent reinforcement learning (MARL) that automates consumers' bidding and manages their solar photovoltaic and energy storage resources. The framework also integrates physical network constraints to achieve voltage control and ensure the physical feasibility of P2P energy trading, paving the way for practical applications.
### Specific solutions
1. **Multi-agent reinforcement learning (MARL) framework**:
- The decision problem of each agent participating in the P2P auction is modeled as a Markov decision process (MDP), and agents can exchange information to reach a consensus.
- A consensus-based actor-critic algorithm for decentralized MARL is proposed, in which each agent improves its decisions by iteratively updating its policy and value function.
2. **Supply-demand ratio (SDR) mechanism**:
- The SDR at time \(t\) is defined as the ratio of total supply to total demand, where \(S_t\) and \(B_t\) are the sets of sellers and buyers and \(b_{i,t}\) is the quantity bid of agent \(i\) (buyers' bids are negative, hence the minus sign in the denominator):
\[
\text{SDR}_t=\frac{\sum_{i\in S_t}b_{i,t}}{-\sum_{i\in B_t}b_{i,t}}
\]
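- The market-clearing price \(P_t\) is determined from the SDR, where FIT denotes the feed-in tariff and UR the utility rate, so the price falls linearly from UR at \(\text{SDR}_t = 0\) to FIT at \(\text{SDR}_t = 1\):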
\[
P_t = \begin{cases}
(\text{FIT}-\text{UR})\cdot\text{SDR}_t+\text{UR}, & 0\leq\text{SDR}_t\leq1 \\
\text{FIT}, & \text{SDR}_t > 1
\end{cases}
\]
- This mechanism avoids the near-zero clearing price caused by zero-marginal-cost resources and simplifies bidding, since agents only need to submit quantities.
3. **Integration of network constraints**:
- Virtual penalty terms are added to each agent's reward function to encourage satisfaction of network constraints such as voltage regulation:
\[
r_v = -\lambda\sum_{\kappa = 1}^N\text{clip}\left[\max\left(|V_{\kappa,t}|-\overline{V}_{\kappa},\,\underline{V}_{\kappa}-|V_{\kappa,t}|\right),\{0,M\}\right]
\]
- Here \(\overline{V}_{\kappa}\) and \(\underline{V}_{\kappa}\) are the upper and lower voltage limits of node \(\kappa\), \(\lambda\) scales the penalty, and the \(\text{clip}\) function bounds each node's violation to \([0, M]\), so a voltage within its limits incurs no penalty and any single violation contributes at most \(M\).
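The SDR pricing rule and the voltage penalty described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's implementation: the function names, the sign convention used to split sellers from buyers, the handling of a round with no buyers, and all numeric values (prices in cents/kWh, voltages in per unit) are assumptions made for the example.

```python
from typing import Sequence


def sdr_clearing_price(bids: Sequence[float], fit: float, ur: float) -> float:
    """Clearing price from the supply-demand ratio (SDR).

    Assumed convention: positive bids are quantities offered for sale,
    negative bids are quantities requested by buyers.
    """
    supply = sum(b for b in bids if b > 0)
    demand = -sum(b for b in bids if b < 0)
    if demand == 0:
        # Assumption (not from the source): with no buyers, treat the
        # market as oversupplied, i.e. the SDR > 1 branch.
        return fit
    sdr = supply / demand
    if sdr > 1:
        return fit
    # Linear interpolation: UR at SDR = 0, FIT at SDR = 1.
    return (fit - ur) * sdr + ur


def voltage_penalty(
    v_mag: Sequence[float],
    v_min: Sequence[float],
    v_max: Sequence[float],
    lam: float,
    cap: float,
) -> float:
    """Reward term r_v: negative, scaled sum of clipped voltage violations."""
    total = 0.0
    for v, lo, hi in zip(v_mag, v_min, v_max):
        # Positive only when v is above its upper or below its lower limit.
        violation = max(v - hi, lo - v)
        # clip(violation, {0, M}) with M = cap.
        total += min(max(violation, 0.0), cap)
    return -lam * total


# Hypothetical round: 4 kWh offered, 8 kWh requested, so SDR = 0.5.
price = sdr_clearing_price([3.0, 1.0, -5.0, -3.0], fit=4.0, ur=14.0)
print(price)  # 9.0

# Hypothetical three-node feeder with limits of 0.95 and 1.05 per unit.
r_v = voltage_penalty(
    [1.0, 1.0625, 0.9375], [0.95] * 3, [1.05] * 3, lam=10.0, cap=1.0
)
print(r_v)
```

In this example the first node is within limits and contributes nothing, while the other two each violate a limit by about 0.0125 per unit, so the penalty is roughly -0.25. Because the clearing price depends only on submitted quantities, an agent's action in such a sketch is a single number per round.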
Through these methods, the paper addresses both the lack of technical knowledge and the market pricing difficulties, and it also accounts for the feasibility of P2P energy trading in the physical network, providing theoretical and technical support for practical applications.