A modular k-nearest neighbor classification method for massively parallel text categorization

Hai Zhao, Bao-Liang Lu
DOI: https://doi.org/10.1007/978-3-540-30497-5_134
2004-01-01
Abstract: This paper presents a Min-Max modular k-nearest neighbor (M3-k-NN) classification method for massively parallel text categorization. The basic idea behind the method is to decompose a large-scale text categorization problem into a number of smaller two-class subproblems and to combine all of the individual modular k-NN classifiers trained on those subproblems into a single M3-k-NN classifier. Our experiments in text categorization demonstrate that M3-k-NN is much faster than conventional k-NN, while its classification accuracy is slightly better than that of conventional k-NN. In practice, M3-k-NN is closely related to the high-order k-NN algorithm; therefore, from a theoretical standpoint, the reliability of M3-k-NN is supported to some extent.
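The decompose-and-combine idea can be illustrated with a minimal sketch. The abstract does not give the details, so everything below is an assumption rather than the authors' implementation: the names `split`, `knn_vote`, `m3_two_class`, and `m3_knn_predict` are hypothetical, documents are stood in for by toy 2-D vectors, distance is squared Euclidean, each class is partitioned round-robin into `n_parts` subsets, and module outputs are combined with the usual min-max rule (MIN across the negative-class subsets, then MAX across the positive-class subsets).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(samples, n_parts):
    """Round-robin partition into at most n_parts non-empty subsets (an assumed scheme)."""
    parts = [samples[i::n_parts] for i in range(n_parts)]
    return [p for p in parts if p]


def knn_vote(query, pos, neg, k):
    """One small two-class k-NN module: 1 if the positive class wins a strict majority."""
    pool = [(sq_dist(query, p), 1) for p in pos] + [(sq_dist(query, n), 0) for n in neg]
    pool.sort(key=lambda t: t[0])
    top = pool[:k]
    votes = sum(label for _, label in top)
    return 1 if 2 * votes > len(top) else 0  # ties count as negative


def m3_two_class(query, pos, neg, k, n_parts):
    """Min-max combination over all (positive subset, negative subset) modules."""
    pos_parts, neg_parts = split(pos, n_parts), split(neg, n_parts)
    # Every knn_vote call here is independent, so the modules could run in parallel.
    return max(
        min(knn_vote(query, p, n, k) for n in neg_parts)
        for p in pos_parts
    )


def m3_knn_predict(query, data, k=1, n_parts=2):
    """data maps each class label to its list of training vectors (at least two classes)."""
    scores = {
        c: min(
            m3_two_class(query, data[c], data[o], k, n_parts)
            for o in data if o != c
        )
        for c in data
    }
    # If no class wins all of its pairwise problems, the first label is returned.
    return max(scores, key=scores.get)


if __name__ == "__main__":
    toy = {
        "sports": [(0, 0), (0, 1), (1, 0), (1, 1)],
        "finance": [(5, 5), (5, 6), (6, 5), (6, 6)],
        "tech": [(10, 0), (10, 1), (11, 0), (11, 1)],
    }
    print(m3_knn_predict((0.5, 0.5), toy))
```

Each module only searches the few samples in its own pair of subsets, which is where a speed-up over a single k-NN search on the full training set could come from when the modules run concurrently.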