Feature Selection Method on Imbalanced Text

LIAO Yi-xing, PAN Xue-zeng
DOI: https://doi.org/10.3969/j.issn.1001-0548.2012.04.022
2012-01-01
Abstract: After analyzing the four basic information elements used by traditional feature selection methods, this paper introduces a new measure, strong class information, and proposes a new feature selection method for imbalanced text classification. Strong class information is used to improve classification performance on minority classes, and term frequency is used to improve it on majority classes. Experiments on the Reuters-21578 dataset show that the proposed method outperforms information gain (IG) and the chi-square statistic (CHI), improving both Micro-F1 and Macro-F1.
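To make the baseline and the evaluation metrics concrete, the following is a minimal Python sketch, not the authors' method. It assumes the "four basic information elements" are the standard 2x2 term/class document counts (the abstract does not spell them out), computes the textbook CHI score from them, and shows why Micro-F1 and Macro-F1 diverge on imbalanced data. The function names `chi_square`, `f1`, and `micro_macro_f1` are hypothetical, chosen only for this illustration.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Textbook chi-square score of a term for one class.

    Assumed meaning of the four counts (not stated in the abstract):
      a: documents in the class that contain the term
      b: documents outside the class that contain the term
      c: documents in the class that lack the term
      d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the term or class is constant
    return n * (a * d - b * c) ** 2 / denom


def f1(tp, fp, fn):
    """F1 from true positives, false positives, false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    # Micro: pool counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro


# A classifier that ignores the minority class entirely.
y_true = ["major"] * 9 + ["minor"]
y_pred = ["major"] * 10
micro, macro = micro_macro_f1(y_true, y_pred)
print(f"micro={micro:.3f} macro={macro:.3f}")
```

In this toy case the pooled Micro-F1 stays high because the majority class dominates the counts, while Macro-F1 drops sharply because the minority class contributes an F1 of zero. This is the usual reason both metrics are reported for imbalanced text classification.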