EMA: Auditing Data Removal from Trained Models

Yangsibo Huang, Xiaoxiao Li, Kai Li
DOI: https://doi.org/10.48550/arXiv.2109.03675
IF: 5.414
2021-09-08
Machine Learning
Abstract: Data auditing is the process of verifying whether certain data have been removed from a trained model. A recently proposed method (Liu et al. 20) uses Kolmogorov-Smirnov (KS) distance for such data auditing. However, it fails under certain practical conditions. In this paper, we propose a new method called Ensembled Membership Auditing (EMA) for auditing data removal to overcome these limitations. We compare both methods using benchmark datasets (MNIST and SVHN) and Chest X-ray datasets with multi-layer perceptrons (MLP) and convolutional neural networks (CNN). Our experiments show that EMA is robust under various conditions, including the failure cases of the previously proposed method. Our code is available at: https://github.com/Hazelsuko07/EMA.