Abstract:We study adversarial perturbations when the instances are uniformly distributed over $\{0,1\}^n$. We study both "inherent" bounds that apply to any problem and any classifier for such a problem as well as bounds that apply to specific problems and specific hypothesis classes. As the current literature contains multiple definitions of adversarial risk and robustness, we start by giving a taxonomy for these definitions based on their goals, we identify one of them as the one guaranteeing misclassification by pushing the instances to the error region. We then study some classic algorithms for learning monotone conjunctions and compare their adversarial risk and robustness under different definitions by attacking the hypotheses using instances drawn from the uniform distribution. We observe that sometimes these definitions lead to significantly different bounds. Thus, this study advocates for the use of the error-region definition, even though other definitions, in other contexts, may coincide with the error-region definition. Using the error-region definition of adversarial perturbations, we then study inherent bounds on risk and robustness of any classifier for any classification problem whose instances are uniformly distributed over $\{0,1\}^n$. Using the isoperimetric inequality for the Boolean hypercube, we show that for initial error $0.01$, there always exists an adversarial perturbation that changes $O(\sqrt{n})$ bits of the instances to increase the risk to $0.5$, making classifier's decisions meaningless. Furthermore, by also using the central limit theorem we show that when $n\to \infty$, at most $c \cdot \sqrt{n}$ bits of perturbations, for a universal constant $c< 1.17$, suffice for increasing the risk to $0.5$, and the same $c \cdot \sqrt{n} $ bits of perturbations on average suffice to increase the risk to $1$, hence bounding the robustness by $c \cdot \sqrt{n}$.

Provable Robustness of Adversarial Training for Learning Halfspaces with Noise

Efficiently Learning Adversarially Robust Halfspaces with Noise

Non-Convex SGD Learns Halfspaces with Adversarial Label Noise

Attribute-Efficient Learning of Halfspaces with Malicious Noise: Near-Optimal Label Complexity and Noise Tolerance

Efficient Testable Learning of General Halfspaces with Adversarial Label Noise

Low Rank Matrix Recovery with Adversarial Sparse Noise

Enhancing PAC Learning of Half spaces Through Robust Optimization Techniques

Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-convex Optimization Approach

Efficient PAC Learning of Halfspaces with Constant Malicious Noise Rate

Online Learning of Halfspaces with Massart Noise

Robustness to Label Noise Depends on the Shape of the Noise Distribution in Feature Space

Adversarial Risk and Robustness: General Definitions and Implications for the Uniform Distribution

Reliable Learning of Halfspaces under Gaussian Marginals

Robustra: Training Provable Robust Neural Networks over Reference Adversarial Space.

Distributionally robust halfspace depth

Agnostic Learnability of Halfspaces via Logistic Loss

Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Connections to Evolvability

Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies

Adversarial Robustness with Partial Isometry

A High Dimensional Statistical Model for Adversarial Training: Geometry and Trade-Offs

Adversarial Training and Robustness for Multiple Perturbations