Leveraging permutation testing to assess confidence in positive-unlabeled learning applied to high-dimensional biological datasets

Shiwei Xu, Margaret E. Ackerman
DOI: https://doi.org/10.1186/s12859-024-05834-2
IF: 3.307
2024-06-21
BMC Bioinformatics
Abstract: Compared to traditional supervised machine learning approaches employing fully labeled samples, positive-unlabeled (PU) learning techniques aim to classify "unlabeled" samples based on a smaller proportion of known positive examples. This more challenging modeling goal reflects many real-world scenarios in which negative examples are not available, posing direct challenges to defining prediction accuracy and robustness. While several studies have evaluated predictions learned from only definitive positive examples, few have investigated whether correct classification of a high proportion of known positive (KP) samples from among unlabeled samples can act as a surrogate to indicate model quality.
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology