Abstract: Federated Learning (FL) allows multiple participating clients to train machine learning models collaboratively while keeping their datasets local and exchanging only gradients or model updates with a coordinating server. Existing FL protocols are vulnerable to attacks that aim to compromise data privacy, model robustness, or both. Recently proposed defenses ensure either privacy or robustness, but not both. In this paper, we focus on simultaneously achieving differential privacy (DP) and Byzantine robustness for cross-silo FL, based on the idea of learning from history. Robustness is achieved via client momentum, which averages each client's updates over time. This reduces the variance of the honest clients' updates and exposes small malicious perturbations from Byzantine clients that are undetectable in a single round but accumulate over time. In our initial solution, DP-BREM, DP is achieved by adding noise to the aggregated momentum, and we account for the privacy cost of the momentum, unlike conventional DP-SGD, which accounts for the privacy cost of the gradient. Since DP-BREM assumes a trusted server (one that can obtain clients' local models or updates), we further develop the final solution, DP-BREM+, which achieves the same DP and robustness properties as DP-BREM without a trusted server by using secure aggregation techniques in which the DP noise is securely and jointly generated by the clients. Both theoretical analysis and experimental results demonstrate that our proposed protocols achieve a better privacy-utility tradeoff and stronger Byzantine robustness than several baseline methods under different DP budgets and attack settings.
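The two ideas in the abstract, per-client momentum and noise added to the aggregated momentum, can be illustrated with a minimal Python sketch. This is not the DP-BREM protocol itself. The function names, the momentum coefficient `beta`, the plain norm clipping, and the parameters `max_norm` and `noise_std` are all assumptions made for illustration; the abstract does not specify how client contributions are bounded or how the noise is calibrated.

```python
import math
import random


def update_momentum(momentum, gradient, beta=0.9):
    """Exponential moving average of one client's gradients (assumed form)."""
    return [beta * m + (1.0 - beta) * g for m, g in zip(momentum, gradient)]


def clip_to_norm(vector, max_norm):
    """Scale the vector down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vector))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in vector]


def noisy_momentum_aggregate(momenta, max_norm, noise_std, rng):
    """Clip each client's momentum, sum, add Gaussian noise, then average.

    Plain norm clipping is an illustrative assumption used to bound each
    client's contribution; it stands in for whatever bounding step the
    actual protocol uses.
    """
    clipped = [clip_to_norm(m, max_norm) for m in momenta]
    dim = len(clipped[0])
    total = [sum(m[j] for m in clipped) for j in range(dim)]
    noisy = [t + rng.gauss(0.0, noise_std) for t in total]
    return [v / len(momenta) for v in noisy]


if __name__ == "__main__":
    rng = random.Random(0)
    dim, rounds = 2, 200
    honest = [0.0] * dim      # momentum of an honest client
    byzantine = [0.0] * dim   # momentum of a client adding a small bias
    for _ in range(rounds):
        # Honest gradients are zero-mean noise in this toy setup.
        honest = update_momentum(
            honest, [rng.gauss(0.0, 1.0) for _ in range(dim)])
        # The Byzantine client adds a bias that is small next to the noise.
        byzantine = update_momentum(
            byzantine, [rng.gauss(0.0, 1.0) + 0.5 for _ in range(dim)])
    print("honest momentum:   ", honest)
    print("byzantine momentum:", byzantine)
    print("noisy aggregate:   ",
          noisy_momentum_aggregate([honest, byzantine], 1.0, 0.1, rng))
```

In this toy setup, averaging over rounds shrinks the zero-mean noise in each momentum while a persistent bias remains, which is the intuition the abstract gives for why momentum exposes perturbations that are hard to see in a single round.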

Distributed Momentum for Byzantine-resilient Learning

Ordered Momentum for Asynchronous SGD

Momentum Tracking: Momentum Acceleration for Decentralized Deep Learning on Heterogeneous Data

Byzantine-Robust and Communication-Efficient Distributed Learning via Compressed Momentum Filtering

Asynchronous Byzantine-Robust Stochastic Aggregation with Variance Reduction for Distributed Learning

Communication-Efficient and Byzantine-Robust Distributed Stochastic Learning with Arbitrary Number of Corrupted Workers

Momentum Benefits Non-iid Federated Learning Simply and Provably

Byzantine-Robust Compressed and Momentum-based Variance Reduction in Federated Learning

An Accelerated Distributed Stochastic Gradient Method with Momentum

Byzantine Robustness and Partial Participation Can Be Achieved at Once: Just Clip Gradient Differences

Parallel Momentum Methods Under Biased Gradient Estimations

Efficient Byzantine-Resilient Stochastic Gradient Desce

Momentum Provably Improves Error Feedback!

DP-BREM: Differentially-Private and Byzantine-Robust Federated Learning with Client Momentum

Decentralized Deep Learning using Momentum-Accelerated Consensus

Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data

Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates

Momentum Approximation in Asynchronous Private Federated Learning

Byzantine-Resilient Learning Beyond Gradients: Distributing Evolutionary Search

A Unified Momentum-based Paradigm of Decentralized SGD for Non-Convex Models and Heterogeneous Data

Byzantine-Robust Distributed Online Learning: Taming Adversarial Participants in An Adversarial Environment