Pruning training samples using a supervised clustering algorithm

Minzhang Huang, Hai Zhao, Bao-Liang Lu
DOI: https://doi.org/10.1007/978-3-642-13318-3_32
2010-01-01
Abstract: Practical pattern classification tasks, such as patent classification, are often very large-scale and seriously imbalanced, and applying traditional pattern classification techniques to them directly has proven inefficient and ineffective. In this paper, a supervised clustering algorithm based on the min-max modular network with a Gaussian zero-crossing function is adopted to prune training samples in order to reduce training time and improve generalization accuracy. The effectiveness of the proposed training sample pruning method is verified on a group of real patent classification tasks using support vector machines and the nearest neighbor algorithm.
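The abstract names the Gaussian zero-crossing (GZC) function and the min-max modular network but gives no formulas. The sketch below shows one commonly cited form of the GZC discriminant and the MIN-then-MAX combination, purely as an illustration. The function names `gzc` and `min_max_output`, the width parameter `lam`, and its default value are assumptions made here, not details taken from the paper, and the pruning rule itself is not reproduced because the abstract does not describe it.

```python
import math

def gzc(x, ci, cj, lam=0.5):
    """Assumed GZC form: difference of two Gaussians centred on a
    positive sample ci and a negative sample cj (which must differ).
    The value is positive near ci, negative near cj, and zero on the
    points equidistant from both."""
    sigma = lam * math.dist(ci, cj)  # width scales with the pair distance
    return (math.exp(-(math.dist(x, ci) / sigma) ** 2)
            - math.exp(-(math.dist(x, cj) / sigma) ** 2))

def min_max_output(x, positives, negatives, lam=0.5):
    """Min-max combination: MIN over negative samples for each
    positive sample, then MAX over the positive samples."""
    return max(min(gzc(x, p, n, lam) for n in negatives) for p in positives)
```

For example, `min_max_output([0.0, 0.0], [[0.0, 0.0]], [[1.0, 0.0], [0.0, 2.0]])` is positive because the query coincides with the only positive sample, while a query placed on a negative sample yields a negative value.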