Abstract: Conformal predictions make it possible to define reliable and robust learning algorithms. But they are essentially a method for evaluating whether an algorithm is good enough to be used in practice. To define a reliable learning framework for classification from the very beginning of its design, the concept of scalable classifier was introduced to generalize the concept of classical classifier by linking it to statistical order theory and probabilistic learning theory. In this paper, we analyze the similarities between scalable classifiers and conformal predictions by introducing a new definition of a score function and defining a special set of input variables, the conformal safety set, which can identify patterns in the input space that satisfy the error coverage guarantee, i.e., that the probability of observing the wrong (possibly unsafe) label for points belonging to this set is bounded by a predefined $\varepsilon$ error level. We demonstrate the practical implications of this framework through an application in cybersecurity for identifying DNS tunneling attacks. Our work contributes to the development of probabilistically robust and reliable machine learning models.

What problem does this paper attempt to address?

The paper addresses how to combine Scalable Classifiers (SCs) and Conformal Predictions (CPs): it defines a new score function and introduces the Conformal Safety Set (CSS), which identifies patterns in the input space that meet the error coverage guarantee. Specifically, the goals of the paper are:

1. **Define a new score function**: The paper proposes a natural score function based on Scalable Classifiers, which lets the Conformal Predictions framework be applied to any classifier. This addresses a limitation of traditional Conformal Predictions methods, where the definition of the score function depends on the specific classifier or application scenario.
2. **Introduce the Conformal Safety Set**: The paper introduces the Conformal Safety Set, a special set of input variables that identifies patterns in the input space. These patterns meet the error coverage guarantee, that is, the probability of observing an incorrect (possibly unsafe) label for points belonging to this set is bounded by a predefined error level $\varepsilon$.
3. **Provide theoretical and practical applications**: The paper proves the relationship between the Conformal Safety Set and Scalable Classifiers, and demonstrates the practical value of the framework through a cybersecurity case study (identifying DNS tunneling attacks).

### Main contributions of the paper

- **Definition of the natural score function**: The paper proposes a score function applicable to any classifier, making the Conformal Predictions framework more naturally applicable to different classification tasks.
- **Introduction of the Conformal Safety Set**: By introducing the Conformal Safety Set, the paper provides a method to identify reliable prediction regions in the input space, thereby improving the robustness and reliability of the classifier.
- **Theoretical analysis**: The paper analyzes in detail the relationship between the Conformal Safety Set and Scalable Classifiers and proves the theoretical properties of the Conformal Safety Set.
- **Practical application**: The paper demonstrates the effectiveness of the proposed framework on a practical problem through an application case in the field of cybersecurity.

### Formula summary

- **Definition of Scalable Classifiers**:

  \[ \phi_\theta(x, \rho)=\begin{cases} +1 & \text{if } f_\theta(x, \rho)<0 \\ -1 & \text{otherwise} \end{cases} \]

- **Definition of the score function**:

  \[ s(x, \hat{y}) = -\hat{y}\,\bar{\rho}(x) \]

  where $\bar{\rho}(x)$ satisfies $f_\theta(x, \bar{\rho}(x)) = 0$.

- **Definition of the Conformal Safety Set**:

  \[ \Sigma_\varepsilon=\{x\in X: s(x, +1)\leq s_\varepsilon,\ s(x, -1)>s_\varepsilon\} \]

  where $s_\varepsilon$ is the $\left\lceil (n_c + 1)(1-\varepsilon)\right\rceil / n_c$ quantile of the scores on the calibration set.

- **Theoretical properties of the Conformal Safety Set**:

  \[ S_\varepsilon=\{x\in X: f_\theta(x, \rho_\varepsilon)<0\}\subseteq\Sigma_\varepsilon \]

  \[ S_\varepsilon=\Sigma_\varepsilon\quad\text{if and only if}\quad\Sigma^b_\varepsilon = \emptyset \]
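The score function, calibration quantile, and Conformal Safety Set membership test from the formula summary can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation. It assumes a hypothetical linear scalable classifier $f_\theta(x, \rho) = w^\top x - b + \rho$, so that $\bar{\rho}(x) = b - w^\top x$ in closed form. The function names, the parameters `w` and `b`, and the toy data are all illustrative. The quantile is taken as the $k$-th smallest calibration score with $k = \lceil (n_c + 1)(1-\varepsilon) \rceil$, set to $+\infty$ when $k > n_c$, which is the usual conformal convention.

```python
import math
import numpy as np

def rho_bar(X, w, b):
    # Root in rho of the ASSUMED linear form f(x, rho) = w.x - b + rho = 0.
    return b - X @ w

def score(X, y_hat, w, b):
    # Score from the summary: s(x, y_hat) = -y_hat * rho_bar(x).
    return -y_hat * rho_bar(X, w, b)

def conformal_quantile(cal_scores, eps):
    # k-th smallest calibration score, k = ceil((n_c + 1) * (1 - eps)).
    n_c = len(cal_scores)
    k = math.ceil((n_c + 1) * (1 - eps))
    if k > n_c:
        return math.inf  # too few calibration points for this eps
    return float(np.sort(cal_scores)[k - 1])

def in_safety_set(X, s_eps, w, b):
    # x is in Sigma_eps iff s(x, +1) <= s_eps and s(x, -1) > s_eps.
    return (score(X, +1, w, b) <= s_eps) & (score(X, -1, w, b) > s_eps)

# Toy one-dimensional example (illustrative data only).
w, b = np.array([1.0]), 0.0
X_cal = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y_cal = np.array([1, 1, -1, -1])

cal_scores = score(X_cal, y_cal, w, b)           # scores at the true labels
s_eps = conformal_quantile(cal_scores, eps=0.3)  # k = ceil(5 * 0.7) = 4

X_new = np.array([[-3.0], [-1.0], [0.0], [2.0]])
mask = in_safety_set(X_new, s_eps, w, b)
```

In this toy setup, `mask` flags the points for which the conformal set contains only the label $+1$. With too few calibration points for the requested $\varepsilon$, the quantile becomes infinite and the set is empty, which is the conservative outcome.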

Conformal Predictions for Probabilistically Robust Scalable Machine Learning Classification

Probabilistic Safety Regions Via Finite Families of Scalable Classifiers

Trustworthy Classification through Rank-Based Conformal Prediction Sets

Label Noise Robustness of Conformal Prediction

Conformal Prediction: A Gentle Introduction

Conformal Prediction via Regression-as-Classification

Provably Robust Conformal Prediction with Improved Efficiency

A Conformal Prediction Score that is Robust to Label Noise

Neurosymbolic Conformal Classification

Conformalized Credal Regions for Classification with Ambiguous Ground Truth

Conformal Prediction for Deep Classifier via Label Ranking

Robust Conformal Prediction under Distribution Shift via Physics-Informed Structural Causal Model

Robust Yet Efficient Conformal Prediction Sets

Robust Conformal Prediction Using Privileged Information

Improving Expert Predictions with Conformal Prediction

Generalization and Informativeness of Conformal Prediction

Conformal Prediction with Learned Features

Verifiably Robust Conformal Prediction

An Information Theoretic Perspective on Conformal Prediction

Certifiably Byzantine-Robust Federated Conformal Prediction

The Penalized Inverse Probability Measure for Conformal Classification