Abstract: Recent research has investigated the hidden vulnerability of distributed learning systems to Byzantine attacks and, fortunately, some known defenses can mitigate such attacks when a minority of workers are under adversarial control. Yet our community still knows very little about how to handle situations in which the proportion of malicious workers is 50% or more. Based on our preliminary study of this open challenge, we find that more can be done to restore Byzantine robustness in these more threatening situations if we make better use of the auxiliary information inside the learning process. In this paper, we propose Justinian's GAAvernor (GAA), a Gradient Aggregation Agent that learns to be robust against Byzantine attacks via reinforcement learning techniques. GAA uses its historical interactions with the workers as experience and uses a quasi-validation set, a small dataset of fewer than 10 data samples from similar data domains, to generate reward signals for policy learning. As a complement to existing defenses, our proposed approach does not bound the expected number of malicious workers and is proved to be robust in more challenging scenarios. Through extensive evaluations on four benchmark systems and against various adversarial settings, our proposed defense shows desirable robustness, as if the systems were under no attack, even in some cases where 90% of the workers are Byzantine workers controlled by the adversary. Meanwhile, our approach shows a level of time efficiency similar to that of state-of-the-art defenses. Moreover, GAA provides highly interpretable traces of worker behavior as by-products, which can support further mitigation such as Byzantine worker detection and behavior pattern analysis.
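The abstract does not specify the agent's policy, reward, or update rule, so the following is only an illustrative sketch of the general idea and not GAA's actual algorithm. It assumes a softmax policy over per-worker aggregation weights, a reward equal to the first-order estimate of how much each worker's gradient would reduce the loss on a quasi-validation set of 8 samples, a linear-regression task, and a sign-flipping attack. All names (`run_sketch`, `policy_lr`, the attack itself) are hypothetical choices made for this example.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def mse_loss_and_grad(theta, X, y):
    r = X @ theta - y
    return 0.5 * np.mean(r ** 2), X.T @ r / len(y)

def run_sketch(n_workers=10, n_byzantine=7, dim=5, steps=300,
               lr=0.1, policy_lr=5.0, seed=0):
    rng = np.random.default_rng(seed)
    theta_true = rng.normal(size=dim)

    # Quasi-validation set (assumption): 8 clean samples from the same domain.
    X_val = rng.normal(size=(8, dim))
    y_val = X_val @ theta_true

    theta = np.zeros(dim)          # model parameters held by the server
    logits = np.zeros(n_workers)   # policy parameters: one logit per worker

    for _ in range(steps):
        grads = []
        for i in range(n_workers):
            X = rng.normal(size=(32, dim))
            y = X @ theta_true + 0.01 * rng.normal(size=32)
            _, g = mse_loss_and_grad(theta, X, y)
            if i < n_byzantine:
                g = -4.0 * g       # hypothetical attack: scaled sign flip
            grads.append(g)
        G = np.stack(grads)        # shape (n_workers, dim)

        # Reward signal (assumption): alignment of each submitted gradient
        # with the descent direction on the quasi-validation set.
        _, g_val = mse_loss_and_grad(theta, X_val, y_val)
        reward = G @ g_val

        # Gradient ascent on the expected reward alpha . reward with respect
        # to the logits; for a softmax this is alpha * (reward - baseline).
        alpha = softmax(logits)
        baseline = alpha @ reward
        logits += policy_lr * alpha * (reward - baseline)

        # Aggregate with the updated weights and take an SGD step.
        alpha = softmax(logits)
        theta -= lr * (alpha @ G)

    return theta, theta_true, softmax(logits)

if __name__ == "__main__":
    theta, theta_true, alpha = run_sketch()
    print("weight on honest workers:", alpha[7:].sum())
    print("parameter error:", np.linalg.norm(theta - theta_true))
```

In this toy setting the first 7 of 10 workers are Byzantine, so a plain average would move the model in the wrong direction. The learned weights are also a simple example of an interpretable trace: workers whose weight collapses toward zero are candidates for Byzantine worker detection.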

Communication-Efficient and Byzantine-Robust Distributed Stochastic Learning with Arbitrary Number of Corrupted Workers

Asynchronous Byzantine-Robust Stochastic Aggregation with Variance Reduction for Distributed Learning

Efficient Byzantine-Resilient Stochastic Gradient Desce

Byzantine Robustness and Partial Participation Can Be Achieved at Once: Just Clip Gradient Differences

Adaptive Distributed Learning with Byzantine Robustness: A Gradient-Projection-Based Method

Byzantine-Robust and Communication-Efficient Distributed Learning via Compressed Momentum Filtering

Robust Distributed Learning Against Both Distributional Shifts and Byzantine Attacks

Mitigating Model Poisoning Attacks on Distributed Learning with Heterogeneous Data

Communication-Efficient Distributed Learning via Sparse and Adaptive Stochastic Gradient

Byzantine-Resilient Decentralized Collaborative Learning

High Dimensional Distributed Gradient Descent with Arbitrary Number of Byzantine Attackers

Buffered Asynchronous SGD for Byzantine Learning

Justinian's GAAvernor: Robust Distributed Learning with Gradient Aggregation Agent

Byzantine-resilient Decentralized Stochastic Gradient Descent

One-Bit Byzantine-Tolerant Distributed Learning via Over-the-Air Computation

BASGD: Buffered Asynchronous SGD for Byzantine Learning

Byzantine-robust decentralized stochastic optimization with stochastic gradient noise-independent learning error

Fall of Empires: Breaking Byzantine-tolerant SGD by Inner Product Manipulation

Byzantine-Resilient Non-Convex Stochastic Gradient Descent

Byzantine-Robust Compressed and Momentum-based Variance Reduction in Federated Learning

RSA: Byzantine-Robust Stochastic Aggregation Methods for Distributed Learning from Heterogeneous Datasets