Two Odds-Radio-Based Text Classification Algorithms

Zhi-Hong Deng, Shi-Wei Tang, Dong-Qing Yang, Ming Zhang, Xiao-Bin Wu, Meng Yang
DOI: https://doi.org/10.1109/WISEW.2002.1177866
2002-01-01
Abstract: Since the 1990s, the exponential growth of Web documents has led to a great deal of interest in developing efficient tools and software to assist users in finding relevant information. Text classification has proved useful in helping to organize and search text information on the Web. Although a number of text classification algorithms exist, most of them are either inefficient or too complex. In this paper we present two Odds-Radio-Based text classification algorithms, called OR and TF*OR respectively. We evaluated our algorithms on two text collections and compared them against k-NN and SVM. Experimental results show that OR and TF*OR are competitive with k-NN and SVM. Furthermore, OR and TF*OR are much simpler and faster than k-NN and SVM. The results also indicate that it is not TF but the relevance factors derived from Odds Radio that play the decisive role in document categorization.