Classification under Streaming Emerging New Classes: A Solution Using Completely-Random Trees

Xin Mu, Kai Ming Ting, Zhi-Hua Zhou
DOI: https://doi.org/10.1109/tkde.2017.2691702
IF: 9.235
2017-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: This paper investigates an important problem in stream mining, i.e., classification under streaming emerging new classes or SENC. The SENC problem can be decomposed into three subproblems: detecting emerging new classes, classifying known classes, and updating models to integrate each new class as part of known classes. The common approach is to treat it as a classification problem and solve it using either a supervised learner or a semi-supervised learner. We propose an alternative approach by using unsupervised learning as the basis to solve this problem. The proposed method employs completely-random trees which have been shown to work well in unsupervised learning and supervised learning independently in the literature. The completely-random trees are used as a single common core to solve all three subproblems: unsupervised learning, supervised learning, and model update on data streams. We show that the proposed unsupervised-learning-focused method often achieves significantly better outcomes than existing classification-focused methods.
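To make the idea of completely-random trees as an unsupervised detector more concrete, here is a minimal sketch of the first subproblem only (flagging instances that may belong to an emerging new class). It is not the authors' algorithm: the abstract gives no implementation details, so the sketch assumes an isolation-style scoring rule in which an instance that reaches a leaf after few random splits is treated as a candidate for a new class. The names build_tree, path_length, and mean_path_length, the synthetic data, and all parameter values (100 trees, subsamples of 64, depth limit of 10) are hypothetical choices made for illustration.

```python
import random

def build_tree(points, rng, depth=0, max_depth=10):
    """Grow a completely-random tree: both the split attribute and the
    split value are drawn at random, and no class labels are consulted."""
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}          # leaf
    dim = rng.randrange(len(points[0]))       # random attribute
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}          # cannot split further
    cut = rng.uniform(lo, hi)                 # random split value
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    return {"dim": dim, "cut": cut,
            "left": build_tree(left, rng, depth + 1, max_depth),
            "right": build_tree(right, rng, depth + 1, max_depth)}

def path_length(tree, x):
    """Number of splits x passes through before it reaches a leaf."""
    depth = 0
    while "dim" in tree:
        branch = "left" if x[tree["dim"]] < tree["cut"] else "right"
        tree = tree[branch]
        depth += 1
    return depth

def mean_path_length(forest, x):
    """Average path length of x over all trees in the forest."""
    return sum(path_length(t, x) for t in forest) / len(forest)

# Assumption: synthetic 2-D data standing in for the known classes.
rng = random.Random(0)
known = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(256)]

# Each tree is grown on a small random subsample of the known data.
forest = [build_tree(rng.sample(known, 64), rng) for _ in range(100)]

known_score = mean_path_length(forest, (0.0, 0.0))    # inside the known region
novel_score = mean_path_length(forest, (10.0, 10.0))  # far from all known data

print("known-region instance:", known_score)
print("far-away instance:    ", novel_score)
```

In this sketch, an instance far from the known data tends to land in a sparse region after only a few random splits, so its mean path length is shorter than that of an instance in the dense region. A full SENC solution would also need a threshold for flagging, labels stored at the leaves to classify known classes, and a rule for updating the trees once a new class is confirmed; none of these are shown here.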