Automatic Text Classification Based On Rough Set And Improved Quick-Reduce Algorithm

Minghu Jiang, Beixing Deng, Xiaowei W. Sheng, Xiaofang Tang, Qiuqi Ruan, Baozong Yuan
2004-01-01
Abstract: This paper proposes a fast dimensionality reduction algorithm for automatic text classification (TC). It applies Rough Set theory, whose attribute reduction algorithm can greatly reduce the dimensionality of document vectors. The experimental results show that the proposed algorithm keeps important low-frequency words while removing high-frequency words that are of no use in classification. The algorithm thus effectively reduces the dimensionality of the feature space and reaches higher accuracy, while losing less useful information, than conventional reduction methods.
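The abstract does not spell out the improved algorithm, so the following is only a minimal sketch of the classic greedy Quick-Reduce procedure from rough set theory, not the authors' variant. It assumes documents are represented as discrete term-presence vectors; the function names `dependency` and `quick_reduce` and the toy data are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def dependency(rows, labels, attrs):
    """Rough-set dependency degree of the class label on `attrs`.

    Objects are grouped into equivalence classes by their values on `attrs`;
    the result is the fraction of objects that fall in a class whose members
    all share one label (the positive region).
    """
    classes = defaultdict(list)
    for row, label in zip(rows, labels):
        classes[tuple(row[a] for a in attrs)].append(label)
    positive = sum(len(ls) for ls in classes.values() if len(set(ls)) == 1)
    return positive / len(rows)


def quick_reduce(rows, labels):
    """Greedy attribute reduction (classic Quick-Reduce, illustrative only).

    Repeatedly adds the attribute that yields the highest dependency degree
    until the selected subset matches the dependency of the full attribute set.
    """
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct = []
    best = dependency(rows, labels, reduct)
    while best < target:
        candidates = [a for a in all_attrs if a not in reduct]
        scores = {a: dependency(rows, labels, reduct + [a]) for a in candidates}
        chosen = max(candidates, key=lambda a: scores[a])
        reduct.append(chosen)
        best = scores[chosen]
    return reduct


# Toy data (hypothetical): columns are presence flags for four terms.
# Term 0 occurs in every document; term 3 is rare and does not separate classes.
rows = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
]
labels = ["sports", "sports", "finance", "finance"]

print(quick_reduce(rows, labels))  # -> [1]
```

In this toy case the term present in every document (column 0) has dependency degree 0 and is never selected, which mirrors the abstract's point that high-frequency words with no discriminative value can be dropped.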