Robust Model Selection for Positive and Unlabeled Learning with Constraints

Tong Wei, Hai Wang, Weiwei Tu, Yufeng Li
DOI: https://doi.org/10.1007/s11432-020-3167-1
2022-01-01
Science China Information Sciences
Abstract: Positive and unlabeled (PU) learning is the problem in which the training data contains only positive and unlabeled samples. Although PU learning is widely used in real-world applications, model selection for it remains challenging. Specifically, traditional model selection methods are often highly sensitive to the class prior and to the data size, which adds human overhead to hyperparameter optimization. In this paper, we present ODE, a method for robust model selection in PU learning. We introduce two novel model evaluators, based on the integral probability metric and the area under the curve, that are free of the class prior, and we further employ a variance reduction method to improve the quality of model selection. In addition, we perform model selection under user-defined constraints and propose a fast halving-style search algorithm to efficiently identify the most promising model configuration. Extensive empirical studies demonstrate that the proposed method is more robust and more computationally efficient than many state-of-the-art methods.
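To make the two main ideas in the abstract concrete, the following is a minimal sketch and not the paper's implementation. It assumes that labeled positives are drawn from the same distribution as the positives hidden in the unlabeled set. Under that assumption, the AUC measured between positives and unlabeled samples equals π/2 + (1 − π)·AUC in expectation, where π is the class prior, so ranking models by it needs no estimate of π. The names `pu_auc` and `halving_search`, the toy scorers, and the synthetic data are all hypothetical; the paper's own evaluators, variance reduction, and constraint handling are not reproduced here.

```python
import random


def pu_auc(pos_scores, unl_scores):
    """AUC between labeled positives and unlabeled samples (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for u in unl_scores:
            if p > u:
                wins += 1.0
            elif p == u:
                wins += 0.5
    return wins / (len(pos_scores) * len(unl_scores))


def halving_search(candidates, pos, unl, min_budget=20, seed=0):
    """Halving-style search (illustrative): evaluate every surviving
    candidate on a small subsample, keep the better half, double the
    subsample size, and repeat until one candidate remains."""
    rng = random.Random(seed)
    alive = dict(candidates)
    budget = min_budget
    while len(alive) > 1:
        p_sub = rng.sample(pos, min(budget, len(pos)))
        u_sub = rng.sample(unl, min(budget, len(unl)))
        scores = {
            name: pu_auc([f(x) for x in p_sub], [f(x) for x in u_sub])
            for name, f in alive.items()
        }
        ranked = sorted(alive, key=lambda name: scores[name], reverse=True)
        keep = max(1, len(ranked) // 2)
        alive = {name: alive[name] for name in ranked[:keep]}
        budget *= 2
    return next(iter(alive))


# Synthetic 1-D data (assumption): positives ~ N(2, 1), negatives ~ N(-2, 1),
# and the unlabeled set mixes them with class prior 0.3.
data_rng = random.Random(42)
pos = [data_rng.gauss(2.0, 1.0) for _ in range(200)]
unl = [
    data_rng.gauss(2.0, 1.0) if data_rng.random() < 0.3 else data_rng.gauss(-2.0, 1.0)
    for _ in range(300)
]

# Hypothetical candidate "models": each maps a feature to a score.
candidates = {
    "good": lambda x: x,
    "abs": lambda x: abs(x),
    "constant": lambda x: 0.0,
    "inverted": lambda x: -x,
}

best = halving_search(candidates, pos, unl)
print(best)
```

In this sketch the class prior 0.3 is used only to generate the data; neither the evaluator nor the search reads it, which is the property the abstract refers to as being free of the class prior.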