Abstract: Differential privacy (DP), a rigorous mathematical definition that quantifies privacy leakage, has become a well-accepted standard for privacy protection. Combined with powerful machine learning (ML) techniques, differentially private machine learning (DPML) is increasingly important. DP-SGD, the most classic DPML algorithm, incurs a significant loss of utility, which hinders DPML's deployment in practice. Many recent studies have proposed improved algorithms based on DP-SGD to mitigate this utility loss. However, these studies were conducted in isolation and do not comprehensively measure the performance of the improvements they propose. More importantly, there is a lack of comprehensive research comparing the improvements in these DPML algorithms across utility, defensive capability, and generalizability. We fill this gap by performing a holistic measurement of improved DPML algorithms, covering utility and defensive capability against membership inference attacks (MIAs) on image classification tasks. We first present a taxonomy of where improvements are located in the ML life cycle. Based on this taxonomy, we perform an extensive measurement study of the improved DPML algorithms, covering twelve algorithms, four model architectures, four datasets, two attacks, and various privacy budget configurations. We also cover state-of-the-art label differential privacy (Label DP) algorithms in the evaluation. According to our empirical results, DP can effectively defend against MIAs, and sensitivity-bounding techniques such as per-sample gradient clipping play an important role in this defense. We also identify some improvements that maintain model utility while defending against MIAs more effectively. Experiments show that Label DP algorithms incur less utility loss but are vulnerable to MIAs. ML practitioners may benefit from these evaluations when selecting appropriate algorithms.
To support our evaluation, we implement a modular, reusable software tool, DPMLBench, which enables owners of sensitive data to deploy DPML algorithms and serves as a benchmark tool for researchers and practitioners.
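The abstract singles out per-sample gradient clipping as the sensitivity-bounding step in DP-SGD. As a rough illustration only, the sketch below shows one DP-SGD-style update in NumPy: each example's gradient is clipped to a fixed L2 norm, the clipped gradients are summed, Gaussian noise scaled to the clipping bound is added, and the result is averaged. The function names (`clip_per_sample`, `dp_sgd_step`) and parameters (`clip_norm`, `noise_multiplier`, `lr`) are hypothetical and are not taken from DPMLBench; privacy accounting is omitted.

```python
import numpy as np


def clip_per_sample(grads, clip_norm):
    """Scale each row (one example's gradient) so its L2 norm is at most clip_norm."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    # Guard against division by zero for all-zero gradients.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return grads * scale


def dp_sgd_step(params, per_sample_grads, clip_norm, noise_multiplier, lr, rng):
    """One illustrative DP-SGD-style update (hypothetical helper, no privacy accounting)."""
    clipped = clip_per_sample(per_sample_grads, clip_norm)
    # Gaussian noise with standard deviation noise_multiplier * clip_norm,
    # added once to the sum of the clipped gradients.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / len(per_sample_grads)
    return params - lr * noisy_mean


# Toy usage: two per-example gradients with L2 norms 5.0 and 0.5.
rng = np.random.default_rng(0)
grads = np.array([[3.0, 4.0], [0.3, 0.4]])
clipped = clip_per_sample(grads, clip_norm=1.0)
new_params = dp_sgd_step(np.zeros(2), grads, clip_norm=1.0,
                         noise_multiplier=1.0, lr=0.1, rng=rng)
```

In this toy input, the first gradient is scaled down to norm 1 while the second is left unchanged, so no single example can move the summed gradient by more than `clip_norm`. That bound is what the added noise is calibrated against.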

New Secure Sparse Inner Product with Applications to Machine Learning

SHAPER: A General Architecture for Privacy-Preserving Primitives in Secure Machine Learning

Private Set Intersection for Unequal Set Sizes with Mobile Applications

DPMLBench: Holistic Evaluation of Differentially Private Machine Learning

PD-ML-Lite: Private Distributed Machine Learning from Lighweight Cryptography

Towards Secure and Practical Machine Learning Via Secret Sharing and Random Permutation

SecureML: A System for Scalable Privacy-Preserving Machine Learning

PrivLogit: Efficient Privacy-preserving Logistic Regression by Tailoring Numerical Optimizers

SSNet: A Lightweight Multi-Party Computation Scheme for Practical Privacy-Preserving Machine Learning Service in the Cloud

Inner product encryption from Middle-Product Learning With Errors

ESCAPED: Efficient Secure and Private Dot Product Framework for Kernel-based Machine Learning Algorithms with Applications in Healthcare

Online Context-aware Streaming Data Release with Sequence Information Privacy

Low-Latency Privacy-Preserving Deep Learning Design via Secure MPC

PrivPy: Enabling Scalable and General Privacy-Preserving Machine Learning

Verifiable inner product computation on outsourced database for authenticated multi-user data sharing

Privacy efficient federal learning approach for smart security

An Efficient Privacy-aware Split Learning Framework for Satellite Communications

Scalable Privacy-Preserving Distributed Learning

VPiP: Values Packing in Paillier for Communication Efficient Oblivious Linear Computations

On the Complexity of Inner Product Similarity Join

Faster CryptoNets: Leveraging Sparsity for Real-World Encrypted Inference