Learning Effective Features for Chinese Text Categorization

DS Luo, XH Wang, XH Wu, HS Chi
DOI: https://doi.org/10.1109/nlpke.2005.1598809
2005-01-01
Abstract: Text categorization suffers from high dimensionality, which leaves the learning system with either lower efficiency or lower performance. A number of feature selection methods have therefore been adopted or proposed for dimensionality reduction, such as document frequency (DF), information gain (IG), and the chi-square statistic. Unlike these traditional feature selection methods, this paper presents a method based on the idea of "discriminative learning", in which learned "effective" features, rather than traditional "important" features, are used to construct the feature space. To learn the effective features, a variant of the AdaBoost algorithm and a pairwise multiclass learning scheme are adopted. Simulation results show that the presented method works well.
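The abstract does not give the details of the AdaBoost variant, so the following is only a minimal sketch of the general idea, assuming standard discrete AdaBoost over one-term decision stumps on binary term-presence vectors, applied to each pair of classes in turn. The function names `boost_select_features` and `pairwise_select` and the toy data are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def boost_select_features(X, y, rounds):
    """Run discrete AdaBoost with one-term stumps; return the term indices it picks.

    X: list of 0/1 term-presence vectors, y: labels in {-1, +1}.
    Assumption: this is textbook AdaBoost, not the paper's variant.
    """
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n          # uniform initial document weights
    chosen = []
    for _ in range(rounds):
        best = None            # (weighted error, term index, polarity)
        for j in range(d):
            for pol in (1, -1):
                # Stump predicts `pol` when term j is present, `-pol` otherwise.
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if (pol if xi[j] else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, pol)
        err, j, pol = best
        if err >= 0.5:         # no stump beats chance; stop early
            break
        if j not in chosen:
            chosen.append(j)
        if err == 0:           # the pair is already separated by this term
            break
        alpha = 0.5 * math.log((1 - err) / err)
        # Increase the weight of misclassified documents, then renormalize.
        w = [wi * math.exp(-alpha * yi * (pol if xi[j] else -pol))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return chosen


def pairwise_select(X, labels, rounds):
    """Union of the terms picked by boosting on every pair of classes."""
    selected = set()
    for a, b in combinations(sorted(set(labels)), 2):
        rows = [(x, 1 if c == a else -1)
                for x, c in zip(X, labels) if c in (a, b)]
        selected.update(boost_select_features([x for x, _ in rows],
                                              [s for _, s in rows], rounds))
    return sorted(selected)


# Hypothetical toy corpus: terms 0-2 are class-specific, term 3 occurs everywhere.
X = [[1, 0, 0, 1], [1, 0, 0, 1],
     [0, 1, 0, 1], [0, 1, 0, 1],
     [0, 0, 1, 1], [0, 0, 1, 1]]
labels = ["sports", "sports", "finance", "finance", "tech", "tech"]
selected = pairwise_select(X, labels, rounds=3)
print(selected)
```

In this toy corpus the term that appears in every document is never picked, because a stump on it cannot separate any pair of classes. That is the intuition behind preferring "effective" (discriminative) terms over merely frequent ones, although the paper's actual procedure may differ.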