Abstract: Artificial intelligence systems are prevalent in everyday life, with use cases in retail, manufacturing, health, and many other fields. With the rise in AI adoption, associated risks have been identified, including privacy risks to the people whose data was used to train models. Assessing the privacy risks of machine learning models is crucial to enabling knowledgeable decisions on whether to use, deploy, or share a model. A common approach to privacy risk assessment is to run one or more known attacks against the model and measure their success rate. We present a novel framework for running membership inference attacks against classification models. Our framework takes advantage of the ensemble method, generating many specialized attack models for different subsets of the data. We show that this approach achieves higher accuracy than either a single attack model or an attack model per class label, both on classical and language classification tasks.
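
To illustrate the assessment approach the abstract mentions (run an attack against the model and measure its success rate), here is a minimal sketch of a simple confidence-threshold membership inference attack. It is not the paper's framework. The function names and the confidence values are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def attack_accuracy(member_conf: Sequence[float],
                    nonmember_conf: Sequence[float],
                    threshold: float) -> float:
    """Fraction of samples whose membership the threshold attack guesses correctly."""
    # The attack guesses "member" when the target model's confidence
    # on a sample is at or above the threshold.
    true_pos = sum(c >= threshold for c in member_conf)
    true_neg = sum(c < threshold for c in nonmember_conf)
    return (true_pos + true_neg) / (len(member_conf) + len(nonmember_conf))


def best_threshold(member_conf: Sequence[float],
                   nonmember_conf: Sequence[float]) -> float:
    """Pick the candidate threshold with the highest attack accuracy."""
    candidates = sorted(set(member_conf) | set(nonmember_conf))
    return max(candidates,
               key=lambda t: attack_accuracy(member_conf, nonmember_conf, t))


# Hypothetical confidences of a target model on its predicted class.
members = [0.99, 0.97, 0.95, 0.80, 0.92]      # samples seen during training
nonmembers = [0.60, 0.85, 0.70, 0.96, 0.55]   # samples not seen during training

# Simplification: the threshold is tuned and scored on the same samples.
# A real assessment would tune it on one split and report accuracy on another.
t = best_threshold(members, nonmembers)
acc = attack_accuracy(members, nonmembers, t)
print(f"threshold={t:.2f}, attack accuracy={acc:.2f}")
```

An attack accuracy near 0.5 on balanced member and non-member sets means the attack does no better than guessing. Values well above 0.5 indicate that the model leaks membership information.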

What problem does this paper attempt to address?

The paper aims to improve the accuracy of membership inference attacks (MIA) against classification models. Specifically, the authors propose a framework that uses an ensemble method to generate multiple attack models, each specialized for a different subset of the data.

### Problem Background

With the wide adoption of artificial intelligence (AI) systems, privacy risks have drawn increasing attention. In machine learning (ML), a membership inference attack aims to distinguish samples that were in the training data (members) from samples that were not (non-members). A successful attack shows that the model leaks information about its training data, so the attack's success rate serves as a measure of the model's privacy risk.

### Research Objectives

Existing membership inference attacks usually use a single attack model or one attack model per class label, and these methods have limited effectiveness in some cases. To improve attack accuracy, the paper proposes a framework with three parts:

1. **Ensemble Method**: Divide the initial member and non-member data sets into multiple small, non-overlapping subsets and train a specialized attack model for each subset.
2. **Model Optimization**: For each subset, try multiple combinations of attack model architecture, input features, and scaling method, and select the combination with the highest attack performance.
3. **Result Aggregation**: Aggregate the results of the per-subset attack models to reflect the overall leakage of the target model more comprehensively.

### Main Contributions

- A new ensemble-based framework that improves the accuracy of membership inference attacks.
- The framework applies to both classical models and large language models (LLMs), and performs well against models with privacy defenses applied.
- Experimental results show that the framework achieves higher accuracy on various data sets than either a single attack model or one attack model per class label.

With this method, researchers can evaluate the privacy risks of machine learning models more accurately, which helps organizations make informed decisions on whether to use, deploy, or share a model.
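
The three steps of the framework (split into non-overlapping subsets, pick the best attack configuration per subset, aggregate) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation. A one-feature threshold stands in for a trained attack model. The candidate "combinations" are reduced to a choice of input feature. The aggregation is assumed to be a size-weighted average of per-subset accuracy. All names and the synthetic data are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple

Sample = Dict[str, float]    # signals read from the target model for one record
Candidate = Tuple[str, int]  # (feature name, +1 if higher means "member", else -1)


def partition(items: Sequence, k: int) -> List[list]:
    """Split items into k non-overlapping subsets (round-robin)."""
    return [list(items[i::k]) for i in range(k)]


def scores(samples: Sequence[Sample], cand: Candidate) -> List[float]:
    feature, sign = cand
    return [sign * s[feature] for s in samples]


def accuracy(cand: Candidate, threshold: float,
             members: Sequence[Sample], nonmembers: Sequence[Sample]) -> float:
    hits = sum(x >= threshold for x in scores(members, cand))
    hits += sum(x < threshold for x in scores(nonmembers, cand))
    return hits / (len(members) + len(nonmembers))


def fit_threshold(cand: Candidate, members: Sequence[Sample],
                  nonmembers: Sequence[Sample]) -> float:
    """Stand-in for training an attack model: pick the best threshold."""
    options = sorted(set(scores(members, cand) + scores(nonmembers, cand)))
    return max(options, key=lambda t: accuracy(cand, t, members, nonmembers))


def ensemble_attack(members, nonmembers, k, candidates):
    per_subset = []
    # Step 1: non-overlapping subsets of members and non-members.
    for sub_m, sub_n in zip(partition(members, k), partition(nonmembers, k)):
        hm, hn = len(sub_m) // 2, len(sub_n) // 2
        fit_m, eval_m = sub_m[:hm], sub_m[hm:]
        fit_n, eval_n = sub_n[:hn], sub_n[hn:]
        # Step 2: try every candidate and keep the best one for this subset.
        # Simplification: selection uses the same held-out half that is reported.
        trials = [(accuracy(c, fit_threshold(c, fit_m, fit_n), eval_m, eval_n), c)
                  for c in candidates]
        best_acc, best_cand = max(trials, key=lambda t: t[0])
        per_subset.append((best_acc, best_cand, len(eval_m) + len(eval_n)))
    # Step 3 (assumed aggregation): size-weighted average of subset accuracies.
    total = sum(n for _, _, n in per_subset)
    overall = sum(acc * n for acc, _, n in per_subset) / total
    return overall, [c for _, c, _ in per_subset]


# Synthetic, hypothetical signals: members get higher confidence and lower loss.
rng = random.Random(0)
members = [{"confidence": 0.9 + 0.1 * rng.random(), "loss": 0.2 * rng.random()}
           for _ in range(40)]
nonmembers = [{"confidence": 0.5 + 0.3 * rng.random(), "loss": 0.5 + rng.random()}
              for _ in range(40)]

overall, chosen = ensemble_attack(members, nonmembers, k=4,
                                  candidates=[("confidence", 1), ("loss", -1)])
print(f"aggregate attack accuracy: {overall:.2f}")
print("best candidate per subset:", chosen)
```

The point of choosing a configuration per subset is that different parts of the data may leak membership through different signals, so a single global attack model can underestimate the leakage.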

Improved Membership Inference Attacks Against Language Classification Models
