Classifying skewed data streams based on reusing data

Peng Liu, LiJun Cai, Yong Wang, Longbo Zhang
DOI: https://doi.org/10.1109/ICCASM.2010.5620201
2010-01-01
Abstract: Current research on data stream mining focuses on balanced data streams. However, skewed class distributions appear in many data stream applications. In this paper, we introduce a method for detecting concept drift in skewed data streams and propose an algorithm for classifying skewed data streams based on reusing data, RDFCSDS (Reuse Data for Classifying Skewed Data Streams). We evaluate the RDFCSDS algorithm on the Moving Hyperplane data set. The experimental results show that the sampling method based on reusing data works better than the simple sampling method and the cluster sampling method on skewed data streams with concept drift.
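The abstract does not spell out how the data is reused, so the following is only a minimal sketch of one common reading of the idea: minority-class examples from earlier chunks are kept and added to the current chunk, while the majority class is undersampled. The function name `build_training_set`, the `ratio` parameter, and the bounded pool are illustrative assumptions, not the RDFCSDS procedure itself.

```python
import random
from collections import deque


def build_training_set(chunk, minority_pool, minority_label=1, ratio=1.0, rng=None):
    """Rebalance one chunk of a skewed stream (illustrative, not RDFCSDS).

    chunk:         list of (features, label) pairs from the current chunk.
    minority_pool: deque of minority examples kept from earlier chunks;
                   updated in place with this chunk's minority examples.
    ratio:         majority examples to draw per reused minority example.
    """
    rng = rng or random.Random(0)
    minority = [ex for ex in chunk if ex[1] == minority_label]
    majority = [ex for ex in chunk if ex[1] != minority_label]

    # Reuse step: combine minority examples from earlier chunks with the new ones.
    minority_pool.extend(minority)
    reused = list(minority_pool)

    # Undersample the majority class of the current chunk only.
    n_major = min(len(majority), int(ratio * len(reused)))
    sampled = rng.sample(majority, n_major)
    return reused + sampled


# Hypothetical usage: two chunks with 5% minority examples each.
pool = deque(maxlen=100)  # bounded, so very old examples eventually drop out
chunk = [((i,), 0) for i in range(95)] + [((i,), 1) for i in range(5)]
first = build_training_set(chunk, pool)
second = build_training_set(chunk, pool)
```

Bounding the pool is a design choice made for this sketch: under concept drift, minority examples from much older chunks may no longer reflect the current concept, so they are allowed to expire.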