Rare Category Detection Forest.

Haiqin Weng, Zhenguang Liu, Kevin Chiew, Qinming He
DOI: https://doi.org/10.1007/978-3-319-25159-2_55
2015-01-01
Abstract: Rare category detection (RCD) aims to discover rare categories in a massive unlabeled data set with the help of a labeling oracle. A challenging task in RCD is to discover rare categories that are concealed by numerous data examples from major categories. Only a few algorithms have been proposed for this problem, and most of them have quadratic or cubic time complexity. In this paper, we propose a novel tree-based algorithm, RCD-Forest, with $$O(\varphi n \log (n/s))$$ time complexity and high query efficiency, where $$n$$ is the size of the unlabeled data set. Experimental results on both synthetic and real data sets verify the effectiveness and efficiency of our method.