Nearest Neighbor Ensembles: an Effective Method for Difficult Problems in Streaming Classification with Emerging New Classes.

Xin-Qiang Cai, Peng Zhao, Kai-Ming Ting, Xin Mu, Yuan Jiang
DOI: https://doi.org/10.1109/icdm.2019.00109
2019-01-01
Abstract: This paper re-examines existing systems in streaming classification with emerging new classes (SENC) problems, where new classes that have not been used to train a classifier may emerge in a data stream. We identify that existing systems have an unspecified assumption that emerging new classes are geometrically far from known classes, or instances of known classes are densely distributed, in the feature space. Using a class separation indicator alpha, we refine the SENC problem into an alpha-SENC problem, where alpha indicates a geometric distance between two classes in the feature space. We show that while most existing systems work well in high-alpha SENC problems (i.e., a new class is geometrically far from a known class or instances of known classes are densely distributed), they perform poorly in low-alpha SENC problems. To solve low-alpha SENC problems effectively, we propose an approach using nearest neighbor ensembles, called SENNE. We demonstrate that SENNE is able to handle both the low-alpha and high-alpha SENC problems, which can appear at different times in a single data stream.
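The general idea of flagging a new class with an ensemble of nearest-neighbor tests can be illustrated with a minimal sketch. This is not the SENNE algorithm from the paper, whose details the abstract does not give; it is a generic illustration under stated assumptions. The function names (build_ensemble, outside_fraction, predict), the parameters (n_members, sample_size, threshold), and the synthetic Gaussian data are all hypothetical. Each ensemble member is a random subsample of one known class, with a radius set to the largest nearest-neighbor distance inside that subsample; a point is treated as "new" when most members of every known class place it outside that radius.

```python
import numpy as np

def build_ensemble(X, n_members=50, sample_size=32, seed=0):
    """Build one ensemble for a single known class (illustrative only).

    Each member is a random subsample plus a radius equal to the largest
    nearest-neighbor distance observed inside that subsample.
    """
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        idx = rng.choice(len(X), size=min(sample_size, len(X)), replace=False)
        centers = X[idx]
        # Pairwise distances within the subsample; ignore self-distances.
        d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
        np.fill_diagonal(d, np.inf)
        radius = d.min(axis=1).max()
        members.append((centers, radius))
    return members

def outside_fraction(x, members):
    """Fraction of members whose nearest center is farther from x than the radius."""
    votes = [np.linalg.norm(centers - x, axis=1).min() > radius
             for centers, radius in members]
    return float(np.mean(votes))

def predict(x, ensembles, threshold=0.5):
    """Return the best-matching known label, or "new" if no class accepts x."""
    scores = {label: outside_fraction(x, m) for label, m in ensembles.items()}
    best = min(scores, key=scores.get)
    return "new" if scores[best] > threshold else best

# Synthetic data (an assumption for the demo): two well-separated known classes.
rng = np.random.default_rng(42)
known = {
    0: rng.normal([0.0, 0.0], 0.5, size=(200, 2)),
    1: rng.normal([5.0, 5.0], 0.5, size=(200, 2)),
}
ensembles = {label: build_ensemble(X, seed=label) for label, X in known.items()}

print(predict(np.array([0.0, 0.0]), ensembles))    # point inside a known class
print(predict(np.array([20.0, 20.0]), ensembles))  # point far from both classes
```

The far-away query in this demo corresponds to the easy, well-separated setting that the abstract calls high-alpha. A new class lying close to a known class would likely be much harder for a simple radius test like this one to flag, which is the low-alpha setting the paper targets.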