Interactive Trimming against Evasive Online Data Manipulation Attacks: A Game-Theoretic Approach

Yue Fu, Qingqing Ye, Rong Du, Haibo Hu
2024-03-15
Abstract: With the exponential growth of data and its crucial impact on our lives and decision-making, the integrity of data has become a significant concern. Malicious data poisoning attacks, where false values are injected into the data, can disrupt machine learning processes and lead to severe consequences. To mitigate these attacks, distance-based defenses, such as trimming, have been proposed, but they can be easily evaded by white-box attackers. The evasiveness and effectiveness of poisoning attack strategies are two sides of the same coin, making game theory a promising approach. However, existing game-theoretical models often overlook the complexities of online data poisoning attacks, where strategies must adapt to the dynamic process of data collection.
Cryptography and Security, Databases
What problem does this paper attempt to address?
The core problem this paper addresses is the threat to data integrity in the era of big data and artificial intelligence, in particular the damage that malicious data poisoning attacks do to the machine-learning process. Specifically:

1. **Threats of data poisoning attacks**: As data volume grows exponentially and increasingly shapes our lives and decision-making, data integrity becomes crucial. Malicious entities manipulate data by injecting false values, which distorts analysis results and seriously threatens machine-learning systems that rely on high-quality data.
2. **Limitations of existing defense strategies**: Traditional distance-based defenses, such as trimming (removing data points whose distance from most other data points exceeds a certain threshold), can reduce the impact of poisoned data but are easily evaded by white-box attackers. These attackers can dynamically adjust their strategies to bypass static defenses.
3. **Complexity of online data poisoning attacks**: Existing game-theoretic models usually overlook the complexity of online data poisoning attacks, in which the attack strategy must adapt to the dynamic process of data collection. A defense mechanism that can cope with dynamically changing attack strategies is therefore required.

To address these problems, the authors propose an interactive game-theoretic model that defends against online data manipulation attacks through trimming strategies. The main features of the model are:

- **Complete strategy space**: The model covers all possible attack and defense strategies and applies to powerful evasive and colluding attackers.
- **Application of theoretical physics principles**: The principle of least action and the Euler-Lagrange equation are used to build an analytical model of the game process.
- **Practical application cases**: A case study was carried out on a privacy-preserving data collection system under local differential privacy (LDP), and two strategies, Tit-for-tat and Elastic, were proposed and shown to be effective and accurate in different scenarios.

Through this model, the authors aim to find the equilibrium (a Stackelberg equilibrium) between attackers and defenders in a dynamic environment, so as to resist data poisoning attacks effectively while maintaining data quality and system robustness.
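
To make the limitation of static distance-based trimming concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes one-dimensional numeric data, uses the median as the reference point, and measures distance as the absolute difference; the function name `trim`, the sample values, and the threshold are all hypothetical choices for illustration.

```python
from statistics import median


def trim(values, threshold):
    """Keep values whose absolute distance from the median is at most `threshold`.

    Illustrative assumption: the median stands in for "most data points".
    """
    center = median(values)
    return [v for v in values if abs(v - center) <= threshold]


def mean(values):
    return sum(values) / len(values)


# Hypothetical honest reports clustered around 1.0.
honest = [0.9, 1.0, 1.1, 1.0, 0.95]
threshold = 0.5  # static trimming threshold, assumed known to a white-box attacker

# A naive attacker injects extreme values; trimming removes them.
naive_poison = [10.0, 12.0]
after_naive = trim(honest + naive_poison, threshold)

# An evasive attacker places poison just inside the threshold; trimming keeps it.
evasive_poison = [1.45, 1.45]
after_evasive = trim(honest + evasive_poison, threshold)

print("mean of honest data:      ", mean(honest))
print("mean after naive attack:  ", mean(after_naive))
print("mean after evasive attack:", mean(after_evasive))
```

In this toy setting the naive poison is discarded entirely, while the evasive poison survives trimming and still shifts the estimated mean. This is the kind of gap that motivates a defense whose threshold adapts over the rounds of data collection rather than staying fixed.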