Robustness, Efficiency, or Privacy: Pick Two in Machine Learning

Youssef Allouah, Rachid Guerraoui, John Stephan
DOI: https://doi.org/10.48550/arXiv.2312.14712
2024-03-11
Abstract: The success of machine learning (ML) applications relies on vast datasets and distributed architectures which, as they grow, present major challenges. In real-world scenarios, where data often contains sensitive information, issues like data poisoning and hardware failures are common. Ensuring privacy and robustness is vital for the broad adoption of ML in public life. This paper examines the costs associated with achieving these objectives in distributed ML architectures, from both theoretical and empirical perspectives. We overview the meanings of privacy and robustness in distributed ML, and clarify how they can be achieved efficiently in isolation. However, we contend that the integration of these two objectives entails a notable compromise in computational efficiency. In short, traditional noise injection hurts accuracy by concealing poisoned inputs, while cryptographic methods clash with poisoning defenses due to their non-linear nature. Nevertheless, we outline future research directions aimed at reconciling this compromise with efficiency by considering weaker threat models.
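The clash between cryptographic methods and non-linear poisoning defenses can be illustrated with a toy sketch. It is not taken from the paper: the scalar "gradients", the helper `mask_pairwise`, and the choice of pairwise additive masking (standing in for a cryptographic aggregation scheme) and of the median (standing in for a robust aggregator) are all assumptions made for illustration.

```python
import random
from statistics import mean, median


def mask_pairwise(values, seed=0):
    """Toy additive masking (hypothetical helper): each pair of workers
    shares a random mask that one adds and the other subtracts, so all
    masks cancel in the sum."""
    rng = random.Random(seed)
    masked = list(values)
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            m = rng.uniform(-100.0, 100.0)
            masked[i] += m
            masked[j] -= m
    return masked


# Hypothetical scalar "gradients" from five workers; the last is poisoned.
gradients = [1.0, 1.1, 0.9, 1.05, 50.0]
masked = mask_pairwise(gradients)

# Linear aggregation: the masks cancel, so the mean can be computed on
# masked inputs, but it is dragged away by the poisoned value.
plain_mean = mean(gradients)
masked_mean = mean(masked)

# Non-linear aggregation: the median ignores the outlier on plaintext
# inputs, but applied to masked inputs it no longer relates to them.
plain_median = median(gradients)
masked_median = median(masked)

print(f"mean   plain={plain_mean:.3f}  masked={masked_mean:.3f}")
print(f"median plain={plain_median:.3f}  masked={masked_median:.3f}")
```

In this sketch the mean is compatible with masking but not robust, while the median is robust but not compatible with masking. Evaluating a non-linear aggregator privately would require heavier cryptographic machinery, which is one way to read the efficiency cost the abstract describes.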
Machine Learning; Cryptography and Security; Distributed, Parallel, and Cluster Computing