Semi-supervised Classification of Concept Drift Data Stream Based on Local Component Replacement

Keke Qin, Yimin Wen
DOI: https://doi.org/10.1007/978-981-13-2122-1_8
2018-01-01
Abstract: Compared with the data used in traditional data mining, a data stream has three distinct characteristics that pose new challenges to machine learning and data mining. These challenges become more serious when only a few instances in the stream are labeled. This paper proposes, based on the SPASC algorithm, a local component replacement strategy for updating the classifier pool. The strategy defines a vector based on local accuracy to evaluate how well each "component" of a cluster-based classifier adapts to a new chunk, so that the trained cluster-based classifiers in the pool adapt to the current concept better and faster while retaining as much learned knowledge as possible. The proposed algorithm is compared with state-of-the-art baseline methods on multiple datasets, and the experimental results illustrate its effectiveness.
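The idea of scoring each component by local accuracy and replacing only the weak ones can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the definition of the vector, the replacement rule, or the threshold. The sketch assumes that a component is a (centroid, label) pair, that local accuracy is the fraction of a chunk's labeled instances nearest to a component that share its label, and that a weak component is rebuilt from those instances. All function names and the `threshold` parameter are hypothetical.

```python
from collections import Counter
from math import dist


def nearest(components, x):
    """Index of the component whose centroid is closest to x."""
    return min(range(len(components)), key=lambda i: dist(components[i][0], x))


def local_accuracy_vector(components, labeled):
    """Per-component accuracy on the labeled instances nearest to it.

    Assumed definition, not taken from the paper. A component that attracts
    no labeled instance gets None, meaning there is no evidence about it.
    """
    hits = [0] * len(components)
    seen = [0] * len(components)
    for x, y in labeled:
        i = nearest(components, x)
        seen[i] += 1
        hits[i] += int(components[i][1] == y)
    return [h / s if s else None for h, s in zip(hits, seen)]


def replace_weak_components(components, labeled, threshold=0.5):
    """Rebuild only the components whose local accuracy is below threshold.

    Components with good or unknown local accuracy are kept unchanged, which
    is how this sketch retains previously learned knowledge.
    """
    acc = local_accuracy_vector(components, labeled)
    updated = []
    for i, (centroid, label) in enumerate(components):
        if acc[i] is None or acc[i] >= threshold:
            updated.append((centroid, label))
            continue
        # Labeled instances of the new chunk that fall in this component's region.
        members = [(x, y) for x, y in labeled if nearest(components, x) == i]
        new_label = Counter(y for _, y in members).most_common(1)[0][0]
        new_centroid = tuple(
            sum(x[d] for x, _ in members) / len(members)
            for d in range(len(centroid))
        )
        updated.append((new_centroid, new_label))
    return updated


# Hypothetical toy data: the region around (10, 10) has drifted from "b" to "a".
components = [((0.0, 0.0), "a"), ((10.0, 10.0), "b")]
labeled_chunk = [
    ((0.0, 1.0), "a"),
    ((1.0, 0.0), "a"),
    ((9.0, 10.0), "a"),
    ((10.0, 9.0), "a"),
]
accuracy = local_accuracy_vector(components, labeled_chunk)
updated = replace_weak_components(components, labeled_chunk)
```

In this toy run the first component agrees with every labeled instance near it and is kept, while the second disagrees with all of them and is rebuilt from the new chunk. Replacing a single component rather than the whole classifier is the design choice the abstract motivates: adaptation stays local to the region where the concept has changed.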