Abstract: Collecting training data from untrusted sources exposes machine learning services to poisoning adversaries, who maliciously manipulate training data to degrade the model accuracy. When trained on offline datasets, poisoning adversaries have to inject the poisoned data in advance before training, and the order of feeding these poisoned batches into the model is stochastic. In contrast, practical systems are more usually trained/fine-tuned on sequentially captured real-time data, in which case poisoning adversaries could dynamically poison each data batch according to the current model state. In this paper, we focus on the real-time settings and propose a new attacking strategy, which affiliates an accumulative phase with poisoning attacks to secretly (i.e., without affecting accuracy) magnify the destructive effect of a (poisoned) trigger batch. By mimicking online learning and federated learning on MNIST and CIFAR-10, we show that model accuracy significantly drops by a single update step on the trigger batch after the accumulative phase. Our work validates that a well-designed but straightforward attacking strategy can dramatically amplify the poisoning effects, with no need to explore complex techniques.

What problem does this paper attempt to address?

### Problems the paper attempts to solve

This paper addresses poisoning attacks against machine-learning models trained on real-time data streams. Specifically, it studies how attackers can reduce model accuracy by dynamically injecting malicious data into the training stream. Unlike poisoning attacks on offline datasets, attackers in the real-time setting can adjust their strategy during training and poison each data batch according to the current model state.

#### Main research questions

1. **Challenges of poisoning attacks in real-time data streams**:
   - In a real-time data stream, attackers can adapt the poisoned data to the current model state, a capability that offline poisoning attacks, which must inject poisoned data before training, do not have.
   - To exploit this setting, the paper proposes a new attack strategy, accumulative poisoning attacks, which amplifies the destructive effect of a single trigger batch.
2. **Mechanism of accumulative poisoning attacks**:
   - An accumulative phase makes the model state sensitive to a specific trigger batch, so that a single update step on that batch significantly reduces the model's accuracy.
   - The accumulative phase itself does not affect the model's accuracy, which keeps the attack hidden from any system that monitors accuracy.
3. **Experimental verification**:
   - The paper verifies the effectiveness of the accumulative poisoning attack by simulating online learning and federated learning on the MNIST and CIFAR-10 datasets.
   - The experimental results show that after the accumulative phase, the model's accuracy can drop sharply from 82.09% to 27.66% with just one update step.

#### Formula representation

The objective function of the accumulative poisoning attack can be represented as:

\[
\min_{P,A} \nabla_\theta L(S_{val}; A(\theta_T))^\top \nabla_\theta L(P(S_T); A(\theta_T))
\]

where:

- \( L(S_{val}; A(\theta_T)) \) is the loss on the validation set \( S_{val} \), and \( A(\theta_T) \) represents the model parameters after the accumulative phase.
- \( P(S_T) \) is the poisoned trigger batch.
- \( \nabla_\theta L(S_{val}; A(\theta_T)) \) and \( \nabla_\theta L(P(S_T); A(\theta_T)) \) are the gradients of the loss on the validation set and on the trigger batch, respectively.

To first order, a gradient step on the trigger batch changes the validation loss in proportion to the negative of this inner product, so minimizing the inner product maximizes the loss increase caused by that step. By optimizing this objective while keeping accuracy unchanged during the accumulative phase, the attack makes the model sensitive to a specific trigger batch, so that accuracy collapses quickly once the batch is applied.

#### Conclusion

By proposing the accumulative poisoning attack strategy, the paper shows that in real-time data streams, attackers can amplify the effect of poisoning attacks with a well-designed but straightforward strategy. This finding emphasizes the importance of protecting machine-learning models from poisoning attacks in real-time data streams and provides new ideas for future defense mechanisms.
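The gradient inner-product objective from the formula section can be illustrated on a toy problem. The sketch below is not the paper's implementation: it assumes a linear model with squared loss, a plain gradient-descent update with learning rate `beta`, and a hypothetical poisoning rule that reflects the trigger targets around the current predictions (which negates the batch gradient). All function and variable names are illustrative. It compares the inner product with the actual change in validation loss after one step.

```python
import numpy as np

def loss_and_grad(theta, X, y):
    """Half mean squared error of a linear model and its gradient in theta."""
    residual = X @ theta - y
    return 0.5 * np.mean(residual ** 2), X.T @ residual / len(y)

def alignment(theta, val, batch):
    """Inner product of the validation gradient and the batch gradient."""
    g_val = loss_and_grad(theta, *val)[1]
    g_batch = loss_and_grad(theta, *batch)[1]
    return float(g_val @ g_batch)

def val_loss_change(theta, val, batch, beta):
    """Change in validation loss after one gradient step on `batch`."""
    before = loss_and_grad(theta, *val)[0]
    stepped = theta - beta * loss_and_grad(theta, *batch)[1]
    return float(loss_and_grad(stepped, *val)[0] - before)

rng = np.random.default_rng(0)
theta_true = rng.normal(size=5)
X_val = rng.normal(size=(200, 5))
X_trig = rng.normal(size=(32, 5))
val = (X_val, X_val @ theta_true)        # stand-in for S_val
clean = (X_trig, X_trig @ theta_true)    # stand-in for the clean batch S_T

theta = theta_true + 0.1 * rng.normal(size=5)  # stand-in for A(theta_T)
beta = 0.01                                    # assumed learning rate

# Hypothetical poisoning P: reflect targets around the current predictions,
# which exactly negates the batch gradient at theta.
poisoned = (X_trig, 2 * (X_trig @ theta) - clean[1])

for name, batch in [("clean", clean), ("poisoned", poisoned)]:
    a = alignment(theta, val, batch)
    d = val_loss_change(theta, val, batch, beta)
    print(f"{name}: inner product={a:.4f}, "
          f"first-order change={-beta * a:.6f}, actual change={d:.6f}")
```

Whenever the inner product for a batch is negative, the step on that batch raises the validation loss, which is the direction the objective pushes toward. A real attack would also have to search over the accumulative phase and respect the stealth requirement, neither of which this sketch attempts.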

Accumulative Poisoning Attacks on Real-time Data

Oblivion: Poisoning Federated Learning by Inducing Catastrophic Forgetting

Exploring Model Dynamics for Accumulative Poisoning Discovery

Poisoning Attacks on Machine Learning Models in Cyber Systems and Mitigation Strategies

Indiscriminate Data Poisoning Attacks on Neural Networks

From Adversarial Examples to Data Poisoning Instances: Utilizing an Adversarial Attack Method to Poison a Transfer Learning Model

Poisoning Web-Scale Training Datasets is Practical

Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks

Is Adversarial Training Really a Silver Bullet for Mitigating Data Poisoning?

A Flexible Poisoning Attack Against Machine Learning

Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks

Stronger Data Poisoning Attacks Break Data Sanitization Defenses

Invisible Poisoning: Highly Stealthy Targeted Poisoning Attack

A Poisoning Attack Against the Recognition Model Trained by the Data Augmentation Method

Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching

A Concealed Poisoning Attack to Reduce Deep Neural Networks’ Robustness Against Adversarial Samples

Lethal Dose Conjecture on Data Poisoning

Amplifying Membership Exposure via Data Poisoning

Indiscriminate poisoning attacks are shortcuts

Transferable Availability Poisoning Attacks

Model Poisoning Attack on Neural Network Without Reference Data