An Improved TF-IDF Approach for Text Classification

Zhang Yun-tao, Gong Ling, Wang Yong-cheng
DOI: https://doi.org/10.1631/bf02842477
2004-01-01
Journal of Zhejiang University SCIENCE A
Abstract: This paper presents an improved term frequency/inverse document frequency (TF-IDF) approach that uses confidence, support and characteristic words to enhance the recall and precision of text classification. Synonyms defined by a lexicon are processed in the improved TF-IDF approach. We discuss and analyze in detail the relationship among confidence, recall and precision. Experiments in the science and technology domain gave promising results: the new TF-IDF approach improves the precision and recall of text classification compared with the conventional TF-IDF approach.
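For readers unfamiliar with the terms in the abstract, the sketch below illustrates the conventional TF-IDF baseline together with association-rule style support and confidence for a word with respect to a class. It is a minimal illustration, not the paper's method: the abstract does not give formulas, so the function names `tf_idf` and `support_and_confidence`, the normalized term frequency, the natural-log IDF, and the support/confidence definitions are all assumptions made here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Conventional TF-IDF weights for a list of tokenized documents.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def support_and_confidence(docs, labels, term, label):
    """Support and confidence of the rule `term -> label`.

    Assumed (association-rule style) definitions, not taken from the paper:
      support    = docs containing term AND labeled `label` / all docs
      confidence = docs containing term AND labeled `label` / docs containing term
    """
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    both = sum(lab == label for lab in with_term)
    support = both / len(docs) if docs else 0.0
    confidence = both / len(with_term) if with_term else 0.0
    return support, confidence


# Hypothetical toy corpus, invented for illustration only.
docs = [
    ["gene", "protein", "cell"],
    ["gene", "market"],
    ["market", "stock", "price"],
]
labels = ["bio", "bio", "finance"]

weights = tf_idf(docs)
gene_bio = support_and_confidence(docs, labels, "gene", "bio")
market_fin = support_and_confidence(docs, labels, "market", "finance")
```

In this toy corpus, "gene" appears only in documents labeled "bio", so its confidence for that class is 1.0, while "market" is split between the two classes and gets a confidence of 0.5 for "finance". A word with high confidence for one class is the kind of term one might treat as characteristic of that class, though how the paper combines these quantities with TF-IDF is not stated in the abstract.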