Improved method for text feature selection based on CHI

Jian-zhuo YAN, Peng-ying LI, Li-ying FANG, Li-ying LONG, Xin-yue LIU
DOI: https://doi.org/10.16208/j.issn1000-7024.2016.05.051
2016-01-01
Abstract: The traditional χ² statistical model does not take the frequency of feature terms into account. To make full use of term frequency, an improved chi-square statistic (CHI) algorithm is proposed that is based on the frequency of a term and its distribution within a class and between classes. Text categorization results obtained with the improved method were compared with those of other methods. The analysis indicates that the proposed algorithm outperforms the traditional method, which verifies its effectiveness.
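To make the limitation named in the abstract concrete, the following is a minimal sketch of the traditional CHI score only, in its standard form N(AD − CB)² / ((A+C)(B+D)(A+B)(C+D)). It is not the improved algorithm, whose formula the abstract does not give. The function names `chi_square` and `contingency` and the toy corpus are assumptions made for illustration.

```python
def chi_square(a, b, c, d):
    """Traditional chi-square score of a term for one class.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table (an empty row or column): no evidence of association.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def contingency(docs, labels, term, target):
    """Count the four document-level cells for `term` against class `target`."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        present = term in tokens      # presence only; repeats are ignored
        in_class = label == target
        if present and in_class:
            a += 1
        elif present:
            b += 1
        elif in_class:
            c += 1
        else:
            d += 1
    return a, b, c, d


# Hypothetical toy corpus: same documents, but "goal" is repeated in the second.
labels = ["sport", "sport", "politics", "politics"]
docs_once = [["goal", "match"], ["goal", "team"], ["vote", "poll"], ["vote", "law"]]
docs_many = [["goal", "goal", "goal", "match"], ["goal", "goal", "team"],
             ["vote", "poll"], ["vote", "law"]]

score_once = chi_square(*contingency(docs_once, labels, "goal", "sport"))
score_many = chi_square(*contingency(docs_many, labels, "goal", "sport"))
print(score_once, score_many)
```

Both corpora give the same score because the statistic is built from document counts alone. How often a term occurs inside a document never enters the formula, which is the gap the proposed method aims to close.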