UTrace: Poisoning Forensics for Private Collaborative Learning

Evan Rose, Hidde Lycklama, Harsh Chaudhari, Anwar Hithnawi, Alina Oprea
2024-09-23
Abstract: Privacy-preserving machine learning (PPML) enables multiple data owners to contribute their data privately to a set of servers that run a secure multi-party computation (MPC) protocol to train a joint ML model. In these protocols, the input data remains private throughout the training process, and only the resulting model is made available. While this approach benefits privacy, it also exacerbates the risks of data poisoning, where compromised data owners induce undesirable model behavior by contributing malicious datasets. Existing MPC mechanisms can mitigate certain poisoning attacks, but these measures are not exhaustive. To complement existing poisoning defenses, we introduce UTrace: a framework for User-level Traceback of poisoning attacks in PPML. UTrace computes user responsibility scores using gradient similarity metrics aggregated across the most relevant samples in an owner's dataset. UTrace is effective at low poisoning rates and is resilient to poisoning attacks distributed across multiple data owners, unlike existing unlearning-based methods. We introduce methods for checkpointing gradients with low storage overhead, enabling traceback in the absence of data owners at deployment time. We also design several optimizations that reduce traceback time and communication in MPC. We provide a comprehensive evaluation of UTrace across four datasets from three data modalities (vision, text, and malware) and show its effectiveness against 10 poisoning attacks.
Cryptography and Security, Machine Learning
What problem does this paper attempt to address?
This paper addresses the problem of tracing data-poisoning attacks back to their source in privacy-preserving machine learning (PPML). Specifically, the authors focus on how to identify malicious users in a private collaborative learning setting, where such users influence the behavior of the trained model by contributing tampered data.

### Main problems

1. **Risks of data-poisoning attacks**: In PPML, the input data remains private throughout the training process. This benefits privacy, but it also increases the risk of data poisoning: attackers can induce undesirable model behavior by contributing malicious datasets.
2. **Limitations of existing methods**: Existing secure multi-party computation (MPC) mechanisms can mitigate some poisoning attacks, but these measures are not exhaustive. In particular, existing unlearning-based methods are not effective against poisoning attacks distributed across multiple data owners.

### Solution: UTrace Framework

To address these challenges, the authors introduce UTrace, a framework for user-level traceback of data-poisoning attacks. Its main features are:

- **User responsibility scoring**: UTrace computes a responsibility score for each user from gradient similarity metrics, aggregating the similarity scores of the most relevant samples in that user's dataset.
- **Checkpointing with low storage overhead**: To allow traceback at deployment time without the participation of data owners, UTrace introduces checkpointing methods, such as storing final-layer gradients and random gradient projections, that reduce the required storage.
- **Communication and time optimizations**: UTrace includes several optimizations that reduce traceback time and communication in MPC.

### Formula representation

The UTrace framework uses the following formulas:

1. **Sample-level impact score \(I_{\text{cos}}\)**:
\[ I_{\text{cos}}(z, \hat{z}) = \sum_{t \in T} \eta_t \frac{\langle \nabla_\theta \ell(\theta_t; z), \nabla_\theta \ell(\theta_t; \hat{z}) \rangle}{\|\nabla_\theta \ell(\theta_t; z)\|_2 \, \|\nabla_\theta \ell(\theta_t; \hat{z})\|_2} \]
2. **User-level impact score \(I_{\text{s}}^{\text{cos}}\)**:
\[ I_{\text{s}}^{\text{cos}}(D_i, \hat{z}, R) = \frac{1}{|D_i|} \sum_{z \in D_i} I_{\text{cos}}(z, \hat{z}) \]
3. **User-level impact score \(I_{\text{p}}^{\text{cos}}\)**:
\[ I_{\text{p}}^{\text{cos}}(D_i, \hat{z}, R) = \sum_{t \in T} \eta_t \frac{\langle \nabla_\theta L(\theta_t; D_i), \nabla_\theta \ell(\theta_t; \hat{z}) \rangle}{\|\nabla_\theta L(\theta_t; D_i)\|_2 \, \|\nabla_\theta \ell(\theta_t; \hat{z})\|_2} \]
4. **Top-k selected user responsibility score**
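
To make the first three scores concrete, here is a minimal plaintext sketch in Python with NumPy. It is not the authors' implementation and involves no MPC. The function names, the array layout (checkpoints × samples × gradient dimension), and the small `eps` guard against zero-norm gradients are all assumptions made for illustration. It also assumes that \(T\) indexes stored checkpoints, that \(\eta_t\) is a per-checkpoint weight, and that the dataset-level gradient \(\nabla_\theta L(\theta_t; D_i)\) is the mean of the per-sample gradients. The top-k variant is left out because its formula is not given above.

```python
import numpy as np


def cosine(u, v, eps=1e-12):
    # Cosine similarity; eps (an assumption) avoids division by zero.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps))


def sample_score(train_grads, test_grads, etas):
    # I_cos(z, z_hat): weighted sum over checkpoints of gradient cosines.
    # train_grads, test_grads: arrays of shape (T, d), one row per checkpoint.
    return sum(eta * cosine(g, h) for eta, g, h in zip(etas, train_grads, test_grads))


def user_score_samplewise(user_grads, test_grads, etas):
    # I_s^cos: average of the sample-level scores over the user's dataset.
    # user_grads: array of shape (T, n, d) for n samples in D_i.
    n = user_grads.shape[1]
    total = sum(sample_score(user_grads[:, j], test_grads, etas) for j in range(n))
    return total / n


def user_score_pooled(user_grads, test_grads, etas):
    # I_p^cos: cosine between the dataset-level gradient and the test gradient.
    # Assumption: L(theta; D_i) is the mean loss, so its gradient is the
    # mean of the per-sample gradients at each checkpoint.
    pooled = user_grads.mean(axis=1)  # shape (T, d)
    return sample_score(pooled, test_grads, etas)


# Synthetic example: two checkpoints, two samples per user, 2-D gradients.
etas = [0.1, 0.1]
z_hat_grads = np.array([[1.0, 0.0], [1.0, 0.0]])
aligned_user = np.array([[[1.0, 0.0], [2.0, 0.0]]] * 2)   # gradients align with z_hat
mixed_user = np.array([[[0.0, 1.0], [-1.0, 0.0]]] * 2)    # orthogonal or opposed

for name, grads in [("aligned", aligned_user), ("mixed", mixed_user)]:
    print(name,
          user_score_samplewise(grads, z_hat_grads, etas),
          user_score_pooled(grads, z_hat_grads, etas))
```

Because cosine similarity is invariant to positive rescaling, using the sum rather than the mean of per-sample gradients in `user_score_pooled` would give the same score. The two user-level scores can still differ: the sample-wise score averages cosines, while the pooled score takes the cosine of an averaged gradient.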