Ranking based algorithms for learning from positive and unlabeled examples

Yu Mao, Yanquan Zhou
DOI: https://doi.org/10.1109/FSKD.2012.6233854
2012-01-01
Abstract: Many real-world classification applications can be framed as learning from positive (P) and unlabeled (U) examples. Most algorithms proposed for this problem follow a two-step strategy: 1) identify a set of reliable negative examples (RN) from U; 2) apply a standard classification algorithm to RN and P. Intuitively, the quality of the negative extraction methods (NEMs) used in step 1 is critical, since the classifiers used in step 2 can be very sensitive to noise in RN. Unfortunately, most existing NEMs assume that plenty of positive examples are available and do not work when positive examples are scarce. Furthermore, most studies have not tried to extract positive examples from U. It is conceivable that a classifier trained on an enlarged P (obtained by adding positive examples extracted from U to P) could perform better. Therefore, we propose rank-based algorithms that extract both reliable positive and reliable negative examples from U. We then use these examples to train the subsequent classifiers. The experimental results show that our proposed approaches can greatly enhance the effectiveness of follow-up classifiers, especially when the size of P is small.
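The abstract does not specify how the ranking is computed, so the following is only a minimal sketch of the general idea of extracting both reliable positives and reliable negatives from U by rank. It assumes, purely for illustration, that unlabeled examples are scored by cosine similarity to the centroid of P; the function names (`rank_unlabeled`, `extract_reliable`) and the cutoffs `n_pos` and `n_neg` are hypothetical and are not taken from the paper.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_unlabeled(P, U):
    """Return indices of U sorted from most to least similar to P's centroid.

    Assumption: similarity to the positive centroid stands in for the
    paper's (unspecified here) ranking score.
    """
    c = centroid(P)
    scores = [cosine(u, c) for u in U]
    return sorted(range(len(U)), key=lambda i: scores[i], reverse=True)


def extract_reliable(P, U, n_pos, n_neg):
    """Take the top n_pos ranked examples as reliable positives (RP)
    and the bottom n_neg as reliable negatives (RN)."""
    if n_pos < 0 or n_neg < 0 or n_pos + n_neg > len(U):
        raise ValueError("n_pos and n_neg must be non-negative and fit within U")
    order = rank_unlabeled(P, U)
    rp = [U[i] for i in order[:n_pos]]
    rn = [U[i] for i in order[len(order) - n_neg:]]
    return rp, rn


# Toy data: two positives and four unlabeled points in 2-D.
P = [[1.0, 0.0], [0.9, 0.1]]
U = [[0.8, 0.2], [0.0, 1.0], [0.5, 0.5], [0.1, 0.9]]
RP, RN = extract_reliable(P, U, n_pos=1, n_neg=1)
enlarged_P = P + RP  # step 2 would train a standard classifier on enlarged_P vs. RN
```

In this sketch, step 2 of the two-step strategy would train any standard classifier on `enlarged_P` against `RN`; the choice of classifier and of the cutoffs is left open because the abstract does not state them.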