Class Noise Handling for Effective Cost-Sensitive Learning by Cost-Guided Iterative Classification Filtering

Xingquan Zhu, Xindong Wu
DOI: https://doi.org/10.1109/TKDE.2006.155
IF: 9.235
2006-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: Recent research in machine learning, data mining, and related areas has produced a wide variety of algorithms for cost-sensitive (CS) classification, where the objective is to minimize the misclassification cost rather than to maximize the classification accuracy. These methods often assume that their input is high-quality data free of conflicting or erroneous values, or that the impact of noise is trivial, which is seldom the case in real-world environments. In this paper, we propose a cost-guided iterative classification filter (CICF) to identify noise for effective CS learning. Whereas existing efforts put equal weight on handling noise in all classes, CICF puts more emphasis on expensive classes, which makes it attractive for data sets with a large cost ratio. Experimental results and comparative studies indicate that the existence of noise may seriously corrupt the performance of the underlying CS learners and that, by adopting the proposed CICF algorithm, we can significantly reduce the misclassification cost of a CS classifier in noisy environments.
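To make the distinction between accuracy and misclassification cost concrete, the following is a minimal sketch and not the CICF algorithm itself. The cost values, labels, and function names are hypothetical and chosen only for illustration; class 1 is assumed to be the expensive class.

```python
def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def total_cost(y_true, y_pred, cost):
    """Sum of cost[true][pred] over all instances."""
    return sum(cost[t][p] for t, p in zip(y_true, y_pred))


# Hypothetical cost matrix: cost[t][p] is the cost of predicting p when the
# truth is t. Missing the expensive class (1 -> 0) is assumed to cost 10x
# more than a false alarm (0 -> 1). Correct predictions cost nothing.
COST = {0: {0: 0, 1: 1}, 1: {0: 10, 1: 0}}

y_true = [0, 0, 0, 1, 1]
pred_a = [0, 0, 1, 1, 1]  # one error on the cheap class
pred_b = [0, 0, 0, 1, 0]  # one error on the expensive class

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, accuracy(y_true, pred), total_cost(y_true, pred, COST))
```

Under these assumed costs, both sets of predictions have the same accuracy but differ tenfold in total cost, which illustrates why errors involving an expensive class deserve more attention when the cost ratio is large.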