Rare Category Detection on O(dN) Time Complexity.

Zhenguang Liu, Hao Huang, Qinming He, Kevin Chiew, Lianhang Ma
DOI: https://doi.org/10.1007/978-3-319-06605-9_41
2014-01-01
Abstract: Rare category detection (RCD) aims to find at least one data example of each rare category in an unlabeled data set, using a labeling oracle to confirm that each such category exists. Various approaches to RCD have been proposed, but they have quadratic or even cubic time complexity. In this paper, using histogram density estimation and wavelet analysis, we propose the FRED algorithm and its prior-free version, iFRED, for RCD. Both achieve time complexity that is linear in each of the data set size N and the data dimension d, i.e. O(dN). Theoretical analysis guarantees their effectiveness, and comprehensive experiments on both synthetic and real data sets verify the effectiveness and efficiency of our algorithms.
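The abstract names the two ingredients (histogram density estimation and wavelet analysis) but gives no procedure, so the following is only an illustrative sketch of how the two could be combined, not the authors' FRED or iFRED algorithm. It assumes equal-width per-dimension histograms, a finest-scale Haar-style difference between adjacent bin counts, and an additive per-point score. All function names, the bin count, and the toy data are hypothetical.

```python
from math import sqrt

def histogram_1d(values, bins):
    """Equal-width histogram of one feature in O(N); also returns each value's bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature
    counts, index = [0] * bins, []
    for v in values:
        b = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
        counts[b] += 1
        index.append(b)
    return counts, index

def bin_scores(counts):
    """Score each bin by how far its count rises above a neighbouring bin.

    The differences of adjacent counts, scaled by 1/sqrt(2), are finest-scale
    Haar detail coefficients without downsampling (an assumption of this sketch).
    """
    scores = []
    for i, c in enumerate(counts):
        left = c - counts[i - 1] if i > 0 else 0
        right = c - counts[i + 1] if i + 1 < len(counts) else 0
        scores.append(max(left, right, 0) / sqrt(2))
    return scores

def point_scores(points, bins=40):
    """Sum the bin scores over all d features: O(dN + d * bins) overall."""
    n, d = len(points), len(points[0])
    total = [0.0] * n
    for j in range(d):
        counts, index = histogram_1d([p[j] for p in points], bins)
        scores = bin_scores(counts)
        for i in range(n):
            total[i] += scores[index[i]]
    return total

# Toy data (hypothetical): a 40 x 40 grid as the majority class, plus a
# compact 30-point rare cluster appended at indices 1600..1629.
majority = [(i * 10 / 39, j * 10 / 39) for i in range(40) for j in range(40)]
rare = [(7.1 + 0.001 * k, 3.1 + 0.001 * k) for k in range(30)]
points = majority + rare

total = point_scores(points)
# Sorting is used here only for readability; it costs O(N log N).
ranked = sorted(range(len(points)), key=lambda i: -total[i])
```

In this toy setting the highest-scoring candidates would be the ones passed to the labeling oracle first. Because the sketch works on per-dimension marginals, a majority point that shares both bins with the rare cluster ties with it, which is one reason a real method needs more than this.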