Multi-Agent Reinforcement Learning With Privacy Preservation for Continuous Double Auction-Based P2P Energy Trading

Jiehui Zheng,Ze-Ting Liang,Yuanzheng Li,Zhigang Li,Qing-Hua Wu
DOI: https://doi.org/10.1109/tii.2023.3348823
IF: 12.3
2024-01-01
IEEE Transactions on Industrial Informatics
Abstract:With increasing deployment of distributed energy resources, the energy market which aims for local generation and load profile redistribution is facing the challenge to accommodate various types of participants. To realize social welfare maximization with privacy preserving in a dynamic energy market, this article propose a multiagent reinforcement learning (MARL) method for quotation decision optimization in continuous double auction (CDA)-based peer-to-peer (P2P) energy market. To address the nonstationarity and privacy violation brought by multiagent context, we utilize mean-field approximation to abstract the unauthorized local information of other agents from the public market dynamics. An abstract Q-value function is developed for each agent to infer the neighbor agents' local observation and action through the public clearing results in the dynamic CDA market. Moreover, to avoid sparse reward so as to stabilize the learning process, we propose a dynamic potential-based reward shaping term in the reward. Without altering the learnt optimal policies, the agents can be informed with the additional energy storage state as the reward shaping in each time instants. To validate the effectiveness and economy of our proposed method, simulation studies are conducted on a real-world dataset. Simulation results show that the proposed MARL method produces up to 17% more convergent episodic reward and 67% less energy bills which indicates competitive convergence performance and significant economic benefits.
automation & control systems,computer science, interdisciplinary applications,engineering, industrial
What problem does this paper attempt to address?