Policy evaluation for reinforcement learning over asynchronous multi-agent networks

Xingyu Sha, Jiaqi Zhang, Keyou You
DOI: https://doi.org/10.23919/ccc52363.2021.9550466
2021-01-01
Abstract: This paper proposes a fully asynchronous algorithm for policy evaluation of multi-agent reinforcement learning over networks. Without any form of coordination, agents can communicate with neighbors and compute their local variables using (possibly) delayed information at any time. Thus, the proposed scheme fully takes advantage of the distributed setting. We prove that our method converges to a neighborhood of the optimum at a linear rate, showing the computational advantage by reducing the amount of synchronization. Numerical experiments show that our method is robust to straggler agents.