Resilient Mechanism Against Byzantine Failure for Distributed Deep Reinforcement Learning

Mingyue Zhang, Zhi Jin, Jian Hou, Renwei Luo
DOI: https://doi.org/10.1109/issre55969.2022.00044
2022-01-01
Abstract: Distributed deep reinforcement learning (DDRL) has been used in distributed systems to improve their adaptability. However, DDRL-based systems are inevitably exposed to the threat of Byzantine workers, so there is an urgent need to enhance their resilience against Byzantine failures. This paper proposes a resilient mechanism for mitigating the influence of Byzantine workers on DDRL-based systems. First, we formalize the DDRL-based system as a multi-armed bandit model to capture the collective effect of workers on the whole learning process, which transforms the resilient mechanism design problem into a sampling policy optimization problem. Second, we propose a self-adaptation process for filtering out the harmful data generated by Byzantine workers and give a mathematical analysis demonstrating its effectiveness under ideal conditions. Third, based on a typical DDRL-based system (i.e., Asynchronous Advantage Actor-Critic, A3C), we implement a resilient distributed A3C (ReD-A3C). Extensive experiments on DDRL benchmark tasks show that ReD-A3C outperforms available Byzantine-tolerant approaches.
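
To make the bandit framing concrete, here is a minimal sketch that treats each worker as an arm and learns which workers to sample from. It is not the paper's algorithm: UCB1 is swapped in as a standard bandit policy, and the reward model (a binary "did this worker's update help" signal), the success probabilities, and all function and parameter names are hypothetical assumptions chosen only for illustration.

```python
import math
import random


def ucb1_select(counts, means, t, c=2.0):
    """Return the index of the worker (arm) with the highest UCB1 score."""
    # Sample every worker once before comparing scores.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    scores = [m + math.sqrt(c * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def simulate(num_workers=5, byzantine=(3, 4), rounds=2000, seed=0):
    """Simulate worker sampling; returns how often each worker was chosen."""
    rng = random.Random(seed)
    counts = [0] * num_workers   # times each worker was sampled
    means = [0.0] * num_workers  # running mean reward per worker
    for t in range(1, rounds + 1):
        w = ucb1_select(counts, means, t)
        # Assumed reward model: honest workers usually help (p = 0.8),
        # Byzantine workers rarely do (p = 0.1). Both values are made up.
        p = 0.1 if w in byzantine else 0.8
        reward = 1.0 if rng.random() < p else 0.0
        counts[w] += 1
        means[w] += (reward - means[w]) / counts[w]  # incremental mean update
    return counts


counts = simulate()
print(counts)
```

Under these assumptions, the sampling policy concentrates on the honest workers over time, so data from the simulated Byzantine workers is drawn far less often.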