Semantic scoring based on small-world phenomenon for feature selection in text mining

Chong Huang, Yonghong Tian, Tiejun Huang, Wen Gao
DOI: https://doi.org/10.1007/11811305_70
2006-01-01
Abstract: This paper proposes an effective scoring scheme for feature selection in text mining, using characteristics of the small-world phenomenon on the semantic networks of documents. Our focus is on the preservation of both syntactic and statistical information of words, rather than only the simple frequency summarization used in prevailing scoring schemes such as TFIDF. Experimental results on the TREC dataset show that our scoring scheme outperforms the prevailing schemes.