A New Semi-Supervised Learning Based Ensemble Classifier For Recurring Data Stream

Bo Zhang, Dingfang Chen, Qiaohong Zu, Yichao Mao, Yi Pan, Xiao-Min Zhang
DOI: https://doi.org/10.1007/978-3-319-09265-2_77
2014-01-01
Abstract: Stream data are fast, real-time, infinite, and change over time. In this paper, we propose a semi-supervised learning based ensemble classifier for solving the concept drift problem in recurring data streams. Our baseline classifiers group both labeled and unlabeled instances as training points to obtain better learning efficiency from limited data samples. Historical information is kept as part of the weight decision factor when building the ensemble classifier, which helps keep the classifier ensemble set within a reasonable size without losing repeated features. The empirical study shows that our new approach outperforms the general ensemble model and is suitable for classifying massive recurring stream data.
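The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the weight formula, so everything below is an assumption made for illustration: the `Member` class, the blending factor `alpha`, the linear blend of current accuracy with a stored history score, and the top-k pruning rule are all hypothetical and are not the authors' method. The semi-supervised training of each member is left out; members are treated as ready-made predictors.

```python
from collections import Counter


class Member:
    """One ensemble member (hypothetical structure, not from the paper)."""

    def __init__(self, predict, history=0.0):
        self.predict = predict    # callable mapping an instance to a label
        self.history = history    # assumed summary of past usefulness
        self.weight = 0.0


def update_weights(members, labeled_batch, alpha=0.5):
    """Blend accuracy on the current labeled batch with the stored history.

    The linear blend and alpha are assumptions; the abstract only says that
    historical information is part of the weight decision factor.
    """
    for m in members:
        correct = sum(1 for x, y in labeled_batch if m.predict(x) == y)
        accuracy = correct / len(labeled_batch)
        m.weight = alpha * accuracy + (1 - alpha) * m.history
        m.history = m.weight  # carry the blended score into the next batch


def prune(members, max_size):
    """Keep the ensemble bounded by retaining the highest-weighted members."""
    return sorted(members, key=lambda m: m.weight, reverse=True)[:max_size]


def vote(members, x):
    """Weighted majority vote over the members' predictions."""
    scores = Counter()
    for m in members:
        scores[m.predict(x)] += m.weight
    return scores.most_common(1)[0][0]


# Usage: an old member that was useful in the past keeps a nonzero weight
# even when it fails on the current batch, so it can return if its concept recurs.
old = Member(lambda x: 0, history=0.9)
new = Member(lambda x: 1, history=0.0)
batch = [(0, 1), (0, 1)]
update_weights([old, new], batch)
prediction = vote([old, new], 0)
kept = prune([old, new], max_size=1)
```

In this sketch the old member fails on the current batch yet keeps a weight of 0.45 from its history, which is the kind of behavior the abstract attributes to retaining historical information for recurring concepts.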