Abstract: Evasion attack in multi-label learning systems is an interesting, widely witnessed, yet rarely explored research topic. Characterizing the crucial factors determining the attackability of the multi-label adversarial threat is key to interpreting the origin of the adversarial vulnerability and to understanding how to mitigate it. Our study is inspired by the theory of adversarial risk bound. We associate the attackability of a targeted multi-label classifier with the regularity of the classifier and the training data distribution. Beyond the theoretical attackability analysis, we further propose an efficient empirical attackability estimator via greedy label space exploration. It provides provable computational efficiency and approximation accuracy. Substantial experimental results on real-world datasets validate the unveiled attackability factors and the effectiveness of the proposed empirical attackability indicator.

What problem does this paper attempt to address?

The problem this paper addresses is how to evaluate the robustness of multi-label classifiers under evasion attacks, that is, their attackability. Specifically, the researchers focus on:

1. **Identifying the key factors affecting the attackability of multi-label classifiers**: Through theoretical analysis, the authors show that the attack strength, the regularity of the classifier, and the empirical loss on unperturbed data are the external and internal driving forces that determine the attackability of the classifier. They also explore the role of low-rank regularization and adversarial training in hardening the classifier.
2. **Developing a computable method for measuring the attackability of multi-label classifiers**: The authors propose an efficient empirical attackability estimator based on greedy label space exploration and prove its computational efficiency and approximation accuracy.

### Specific Problem Description

In multi-label learning systems, evasion attacks are widely observed but rarely studied. Understanding them requires identifying the factors that determine the attackability of the multi-label adversarial threat, which in turn explains where the adversarial vulnerability comes from and how to mitigate it.

### Research Motivation

Inspired by the theory of adversarial risk bounds, the researchers link the attackability of the targeted multi-label classifier to the regularity of the classifier and the training data distribution. Beyond the theoretical analysis, they propose a method for efficiently estimating empirical attackability through greedy label space exploration, with guarantees on computational efficiency and approximation accuracy.

### Main Contributions

1. **Theoretical Analysis**: The authors measure the attackability of multi-label classifiers by evaluating the expected worst-case misclassification loss over the distribution of adversarial samples. For linear classifiers and deep neural networks, they show that the attack strength, the regularity of the classifier, and the empirical loss on unperturbed data are the main factors determining attackability.
2. **Empirical Evaluation**: The authors cast empirical attackability evaluation as a label space exploration process for each legitimate input instance and show that it is tractable by formulating it as a submodular set function optimization problem, which greedy search solves with a guaranteed approximation accuracy.
3. **Algorithm Design**: To address the computational bottleneck of the original greedy search, the authors propose the Greedy Attack Space Expansion (GASE) algorithm. It provides a computationally economical marginal gain estimator and selects the label with the largest marginal gain, making the exploration of attack targets efficient.

### Experimental Verification

The authors ran experiments on several real-world datasets: Creepware (network security practice), Genbase (biological research), VOC2012 (object recognition), and Planet (environmental research). The results support the theoretical analysis and demonstrate the effectiveness of the proposed GASE algorithm in estimating empirical attackability.

### Summary

This paper aims to understand and quantify the robustness of multi-label classifiers under evasion attacks through theoretical analysis and empirical evaluation. The results reveal the key factors affecting attackability and provide an evaluation method, giving both a theoretical basis and a practical tool for improving the robustness of multi-label classifiers.
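The greedy search mentioned under Main Contributions can be illustrated with a minimal sketch of greedy selection for a set function: at each step, add the candidate with the largest marginal gain. This is not the paper's GASE algorithm or its marginal gain estimator. The function names, the `budget` parameter, and the toy coverage objective (a standard example of a monotone submodular function) are all assumptions made for illustration.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, List


def greedy_select(
    candidates: Iterable[Hashable],
    value: Callable[[FrozenSet], float],
    budget: int,
) -> List[Hashable]:
    """Pick up to `budget` items, each time adding the one with the largest marginal gain."""
    selected: List[Hashable] = []
    remaining = list(candidates)
    current = value(frozenset())
    for _ in range(budget):
        best_item, best_gain = None, 0.0
        for item in remaining:
            # Marginal gain of adding `item` to the current selection.
            gain = value(frozenset(selected + [item])) - current
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:  # no remaining item improves the objective
            break
        selected.append(best_item)
        remaining.remove(best_item)
        current += best_gain
    return selected


# Hypothetical toy objective: each label "covers" a set of items, and the
# value of a label set is the number of distinct items covered.
COVERAGE = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 7},
}


def coverage_value(labels: FrozenSet) -> float:
    covered = set()
    for label in labels:
        covered |= COVERAGE[label]
    return float(len(covered))


picked = greedy_select(COVERAGE.keys(), coverage_value, budget=2)
print(picked, coverage_value(frozenset(picked)))  # ['c', 'a'] 7.0
```

For monotone submodular objectives under a cardinality budget, this kind of greedy selection is classically guaranteed to reach at least a (1 - 1/e) fraction of the optimal value. The exact objective and guarantee used in the paper may differ from this sketch. Note that each step re-evaluates the objective for every remaining candidate, which is the kind of cost a cheaper marginal gain estimator is meant to reduce.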

Characterizing the Evasion Attackability of Multi-label Classifiers
