Abstract: In many binary classification applications, such as disease diagnosis and spam detection, practitioners need to keep the type I error (i.e., the conditional probability of misclassifying a class 0 observation as class 1) below a desired threshold. The Neyman-Pearson (NP) classification paradigm is a natural choice for this need: it minimizes the type II error (i.e., the conditional probability of misclassifying a class 1 observation as class 0) while enforcing an upper bound, $\alpha$, on the type I error. Although the NP paradigm has a century-long history in hypothesis testing, it has not been well recognized or implemented in classification schemes. The common practice of directly limiting the empirical type I error to no more than $\alpha$ does not achieve type I error control, because the resulting classifiers are still likely to have type I errors much larger than $\alpha$. As a result, the NP paradigm has not been properly implemented for many classification scenarios in practice. In this work, we develop the first umbrella algorithm that implements the NP paradigm for all scoring-type classification methods, including popular methods such as logistic regression, support vector machines, and random forests. Powered by this umbrella algorithm, we propose a novel graphical tool for NP classification methods: NP receiver operating characteristic (NP-ROC) bands, motivated by the popular receiver operating characteristic (ROC) curves. NP-ROC bands help choose $\alpha$ in a data-adaptive way and compare different NP classifiers. We demonstrate the use and properties of the NP umbrella algorithm and NP-ROC bands, available in the R package nproc, through simulation and real data case studies.
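The gap between empirical and population type I error control can be illustrated with a short sketch. It is not taken from the nproc package, and it rests on assumptions the abstract does not state: class 0 scores on held-out data are i.i.d. and continuous, the classifier predicts class 1 when a score exceeds the k-th smallest held-out class 0 score, and `delta` is a hypothetical tolerance on the probability that the type I error exceeds $\alpha$. The function names `violation_bound` and `np_rank` are illustrative only.

```python
from math import comb


def violation_bound(n, k, alpha):
    """Bound on P(type I error > alpha) when the threshold is the k-th
    smallest of n held-out class 0 scores (assumes continuous i.i.d. scores).

    This is the upper tail of a Binomial(n, 1 - alpha) distribution at k.
    """
    return sum(
        comb(n, j) * (1 - alpha) ** j * alpha ** (n - j)
        for j in range(k, n + 1)
    )


def np_rank(n, alpha, delta):
    """Smallest rank k whose violation bound is at most delta.

    Returns None when even the largest score (k = n) cannot meet the
    tolerance, i.e. when there are too few class 0 observations.
    """
    tail = 0.0
    best = None
    # Walk down from k = n, accumulating the binomial tail as we go.
    for k in range(n, 0, -1):
        tail += comb(n, k) * (1 - alpha) ** k * alpha ** (n - k)
        if tail > delta:
            break
        best = k
    return best


def naive_rank(n, alpha):
    """Rank that only caps the EMPIRICAL type I error at alpha."""
    # At most floor(n * alpha) of the n scores may lie above the threshold.
    return n - int(n * alpha)
```

Under these assumptions, `violation_bound(1000, naive_rank(1000, 0.05), 0.05)` is roughly one half, which is the sense in which capping the empirical type I error at $\alpha$ leaves a classifier still likely to exceed $\alpha$ on the population. `np_rank(1000, 0.05, 0.05)` instead picks a higher rank, trading some type II error for a controlled violation probability.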

Optimal ROC-Based Classification and Performance Analysis under Bayesian Uncertainty Models

Small-sample precision of ROC-related estimates

Optimal Decision-Theoretic Classification Using Non-Decomposable Performance Metrics

Leveraging Uncertainty Estimates To Improve Classifier Performance

Optimal Bayesian design for model discrimination via classification

Binormal Precision–Recall Curves for Optimal Classification of Imbalanced Data

MCMC implementation of the optimal Bayesian classifier for non-Gaussian models: model-based RNA-Seq classification

Robust Classification Using Posterior Probability Threshold Computation Followed by Voronoi Cell Based Class Assignment Circumventing Pitfalls of Bayesian Analysis of Biomedical Data

Neyman-Pearson (NP) classification algorithms and NP receiver operating characteristics (NP-ROC)

Bayesian ROC surface estimation under verification bias

Bayesian Bootstrap Inference for the ROC Surface

Tailored Bayes: a risk modelling framework under unequal misclassification costs

An efficient variance estimator of AUC and its applications to binary classification

Never mind the metrics -- what about the uncertainty? Visualising confusion matrix metric distributions

The receiver operating characteristic curve accurately assesses imbalanced datasets

A Unified Framework For Performance Analysis Of Bayesian Inference

Optimal strategies for reject option classifiers

The Precision-Recall Plot Is More Informative than the ROC Plot When Evaluating Binary Classifiers on Imbalanced Datasets

A placement-value based approach to concave ROC analysis

Multiclass ROC

Improving Image-Based Precision Medicine with Uncertainty-Aware Causal Models