An Improved KNN Text Classification Algorithm Based on Density

Kansheng Shi, Lemin Li, Haitao Liu, Jie He, Naitong Zhang, Wentao Song
DOI: https://doi.org/10.1109/CCIS.2011.6045043
2011-01-01
Abstract: Text classification has attracted rapidly growing interest over the past few years. As a simple, effective, and nonparametric classification method, KNN is widely used in document classification. However, an uneven distribution of samples in the training set degrades KNN classification results, and such uneven distribution is very common among documents on the Web. To tackle this problem, this paper proposes an improved KNN method, denoted DBKNN. Experimental results show that the DBKNN algorithm can better serve classification requests for large sets of unevenly distributed documents.