Study on Improved CHI for Feature Selection in Chinese Text Categorization

PEI Yingbo, LIU Xiaoxia
DOI: https://doi.org/10.3778/j.issn.1002-8331.2011.04.035
2011-01-01
Abstract: This paper analyzes the factors that influence the categorization accuracy of the CHI (chi-square) feature selection method and removes the negative correlation between terms and categories. The improved approach is applied to weight adjustment and markedly improves categorization quality. Furthermore, concentration, distribution, and frequency information are introduced into the improved approach, which increases categorization accuracy on corpora with an uneven category distribution. The experimental results verify the effectiveness and feasibility of the improved CHI approach.
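As a rough illustration of the idea summarized in the abstract, the sketch below computes the standard chi-square score of a term for a category from a 2x2 document-count table, alongside one simple way to discard negatively correlated terms. The abstract does not give the paper's formula, so the function names, the count variables `a`, `b`, `c`, `d`, and the zero-out rule are assumptions made for illustration only. The concentration, distribution, and frequency factors are not modeled here.

```python
def chi_square(a, b, c, d):
    """Standard chi-square score of a term for a category.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table (an empty row or column): no evidence either way.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def chi_square_positive_only(a, b, c, d):
    """Hypothetical variant: ignore terms negatively correlated with the category.

    The squared numerator in the standard score hides the sign of (a*d - c*b),
    so a term that mostly appears OUTSIDE the category scores as high as one
    that mostly appears inside it. This variant zeroes out the negative case.
    It is an assumption for illustration, not the paper's exact formula.
    """
    if a * d - c * b <= 0:
        return 0.0
    return chi_square(a, b, c, d)


# A term concentrated inside the category versus one concentrated outside it.
inside = (40, 10, 10, 40)
outside = (10, 40, 40, 10)
print(chi_square(*inside), chi_square(*outside))
print(chi_square_positive_only(*inside), chi_square_positive_only(*outside))
```

With the standard score, both example terms receive the same value even though only the first is indicative of the category; the positive-only variant keeps the first and drops the second.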