Calibration methods in imbalanced binary classification

Théo Guilbert, Olivier Caelen, Andrei Chirita, Marco Saerens
DOI: https://doi.org/10.1007/s10472-024-09952-8
IF: 1.019
2024-07-20
Annals of Mathematics and Artificial Intelligence
Abstract: The calibration problem in machine learning classification tasks arises when a model's output score does not align with the ground-truth observed probability of the target class. Several parametric and non-parametric post-processing methods can help calibrate an existing classifier. In this work, we focus on binary classification cases where the dataset is imbalanced, meaning that the negative target class significantly outnumbers the positive one. We propose new parametric calibration methods designed for this specific case and a new calibration measure focusing on the primary objective in imbalanced problems: detecting infrequent positive cases. Experiments on several datasets show that, for imbalanced problems, our approaches outperform state-of-the-art methods in many cases.
Computer Science, Artificial Intelligence; Mathematics, Applied
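
To make the abstract's notion of miscalibration concrete, the following minimal sketch measures the gap between predicted scores and observed positive rates on a toy imbalanced sample. It uses the standard equal-width binned expected calibration error (ECE), which is an assumption chosen for illustration and is not the new calibration measure proposed in the paper. The function name `expected_calibration_error`, the bin count, and the toy data are all hypothetical.

```python
def expected_calibration_error(scores, labels, n_bins=10):
    """Equal-width binned ECE.

    For each score bin, compare the mean predicted score with the observed
    fraction of positives, then average the absolute gaps weighted by bin size.
    Illustrative only: this is the generic ECE, not the paper's measure.
    """
    bins = [[] for _ in range(n_bins)]
    for s, y in zip(scores, labels):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes to the last bin
        bins[idx].append((s, y))

    n = len(scores)
    ece = 0.0
    for b in bins:
        if not b:
            continue  # empty bins contribute nothing
        mean_score = sum(s for s, _ in b) / len(b)
        pos_rate = sum(y for _, y in b) / len(b)
        ece += (len(b) / n) * abs(mean_score - pos_rate)
    return ece


# Toy imbalanced sample (made up): 95 negatives, 5 positives.
labels = [0] * 95 + [1] * 5

# An overconfident scorer: every negative still receives a score of 0.25.
overconfident_scores = [0.25] * 95 + [0.95] * 5
ece_overconfident = expected_calibration_error(overconfident_scores, labels)

# A scorer that outputs the base rate for everyone: calibrated, but uninformative.
base_rate_scores = [0.05] * 100
ece_base_rate = expected_calibration_error(base_rate_scores, labels)

print(ece_overconfident, ece_base_rate)
```

The second scorer is almost perfectly calibrated by this measure even though it cannot separate the classes at all, which suggests why a measure dominated by the abundant negative class may be a poor guide when the goal is detecting infrequent positives.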