Abstract: Differential privacy (DP), as a rigorous mathematical definition quantifying privacy leakage, has become a well-accepted standard for privacy protection. Combined with powerful machine learning (ML) techniques, differentially private machine learning (DPML) is increasingly important. As the most classic DPML algorithm, DP-SGD incurs a significant loss of utility, which hinders DPML's deployment in practice. Many recent studies have proposed improved algorithms based on DP-SGD to mitigate this utility loss. However, these studies were conducted in isolation and do not comprehensively measure how the proposed improvements perform. More importantly, there is a lack of comprehensive research comparing the improvements in these DPML algorithms across utility, defensive capabilities, and generalizability. We fill this gap by performing a holistic measurement of improved DPML algorithms on utility and defense capability against membership inference attacks (MIAs) on image classification tasks. We first present a taxonomy of where improvements are located in the ML life cycle. Based on our taxonomy, we jointly perform an extensive measurement study of the improved DPML algorithms, covering twelve algorithms, four model architectures, four datasets, two attacks, and various privacy budget configurations. We also cover state-of-the-art label differential privacy (Label DP) algorithms in the evaluation. According to our empirical results, DP can effectively defend against MIAs, and sensitivity-bounding techniques such as per-sample gradient clipping play an important role in defense. We also explore some improvements that can maintain model utility and defend against MIAs more effectively. Experiments show that Label DP algorithms achieve less utility loss but are vulnerable to MIAs. ML practitioners may benefit from these evaluations when selecting appropriate algorithms.
To support our evaluation, we implement a modular, reusable software tool, DPMLBench, which enables sensitive data owners to deploy DPML algorithms and serves as a benchmark tool for researchers and practitioners.
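
For readers unfamiliar with the per-sample gradient clipping mentioned in the abstract, the following is a minimal NumPy sketch of one privatized gradient computation in the style of DP-SGD. It is an illustration only and is not taken from DPMLBench: the function name `dp_sgd_step` and the parameters `clip_norm` and `noise_multiplier` are assumptions chosen for this example, and privacy accounting (tracking the privacy budget across steps) is deliberately left out.

```python
import numpy as np


def dp_sgd_step(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Return a noisy average gradient from per-sample gradients.

    per_sample_grads: array of shape (batch_size, num_params).
    clip_norm: hypothetical name for the L2 clipping threshold.
    noise_multiplier: hypothetical name for the noise scale relative to clip_norm.
    """
    grads = np.asarray(per_sample_grads, dtype=float)
    # L2 norm of each sample's gradient, kept as a column for broadcasting.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    # Shrink gradients whose norm exceeds clip_norm; smaller ones are unchanged.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale
    # Gaussian noise with standard deviation proportional to the clipping bound,
    # which is the sensitivity of the summed clipped gradients.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grads.shape[1])
    return (clipped.sum(axis=0) + noise) / grads.shape[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch = np.array([[3.0, 4.0], [0.3, 0.4]])
    print(dp_sgd_step(batch, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single training example can change the summed gradient, which is what allows the added Gaussian noise to be calibrated to a fixed scale.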

SecureMLDebugger: A Privacy-Preserving Machine Learning Debugging Tool.

Privacy-Preserving Collaborative Deep Learning with Unreliable Participants.

Adversarial for Good – Defending Training Data Privacy with Adversarial Attack Wisdom

Poster: Nebula: an Industrial-purpose Privacy-preserving Machine Learning System

SecureML: A System for Scalable Privacy-Preserving Machine Learning

DPMLBench: Holistic Evaluation of Differentially Private Machine Learning

SoK: Wildest Dreams: Reproducible Research in Privacy-preserving Neural Network Training

Wildest Dreams: Reproducible Research in Privacy-preserving Neural Network Training

Protecting Confidentiality, Privacy and Integrity in Collaborative Learning

Privacy Side Channels in Machine Learning Systems

Not Just Cloud Privacy: Protecting Client Privacy in Teacher-Student Learning

SEDML: Securely and efficiently harnessing distributed knowledge in machine learning

Distributed Intelligent Model for Privacy and Secrecy in Preschool Education

ML Privacy Meter: Aiding Regulatory Compliance by Quantifying the Privacy Risks of Machine Learning

Rethinking Privacy in Machine Learning Pipelines from an Information Flow Control Perspective

Privacy-Preserving Efficient Federated-Learning Model Debugging

GuardML: Efficient Privacy-Preserving Machine Learning Services Through Hybrid Homomorphic Encryption

Privacy-preserving Machine Learning through Data Obfuscation

Privacy-Preserving Machine Learning: Methods, Challenges and Directions

An Overview of Privacy in Machine Learning

Towards the Science of Security and Privacy in Machine Learning