An Improved Condensing Algorithm

Xiulan Hao, Chenghong Zhang, Hexiang Xu, Xiaopeng Tao, Shuyun Wang, Yunfa Hu
DOI: https://doi.org/10.1109/icis.2008.67
2008-01-01
Abstract: The kNN classifier is widely used in text categorization. However, kNN has large computational and storage requirements, and its performance suffers when the training data are unevenly distributed. A condensing technique is usually applied to reduce noise in the training data and to cut the cost in time and space. The traditional condensing technique picks samples at random during initialization. Although random sampling is one way to reduce outliers, its highly stochastic nature can sometimes lead to poor performance, so that the advantages of sampling are suppressed. To avoid this, we propose a variation of the traditional condensing technique. Experimental results show that this strategy addresses the above problems effectively.
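For orientation, the traditional condensing step with random initialization that the abstract refers to can be sketched roughly as follows. This is a minimal illustration in the spirit of the classic condensed nearest neighbor rule, not the authors' code: the function names `nearest_label` and `condense`, the `seed` parameter, the use of 1-NN with squared Euclidean distance, and the tuple-based sample format are all assumptions made for this example. The paper's proposed variation is not described in the abstract and is therefore not shown.

```python
import random


def nearest_label(store, x):
    """Return the label of the stored sample closest to x (1-NN, squared Euclidean)."""
    best = min(store, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
    return best[1]


def condense(samples, seed=0):
    """Traditional condensing with a randomly chosen initial sample (illustrative).

    samples: list of (feature_tuple, label) pairs.
    Returns the retained subset of samples.
    """
    rng = random.Random(seed)
    order = list(range(len(samples)))
    rng.shuffle(order)              # random visiting order, random initial sample

    kept = [order[0]]               # indices of retained samples
    kept_set = {order[0]}
    changed = True
    while changed:                  # repeat passes until nothing new is absorbed
        changed = False
        for i in order:
            if i in kept_set:
                continue
            x, y = samples[i]
            store = [samples[j] for j in kept]
            if nearest_label(store, x) != y:
                # Misclassified by the current subset, so it must be retained.
                kept.append(i)
                kept_set.add(i)
                changed = True
    return [samples[j] for j in kept]


# Hypothetical toy data: two well-separated classes.
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((1, 1), "a"),
         ((10, 10), "b"), ((10, 11), "b"), ((11, 10), "b"), ((11, 11), "b")]
condensed = condense(train, seed=42)
```

Because the starting sample and the visiting order depend on `seed`, different seeds can retain different subsets on less cleanly separated data, which is the sensitivity to random initialization that the abstract describes.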