ON CHINESE TEXT CATEGORIZATION BASED ON ROUGH SET AND ENSEMBLE LEARNING

Zhang Xiang, Zhou Mingquan, Dong Lili, Yan Qingbo
DOI: https://doi.org/10.3969/j.issn.1000-386X.2011.01.010
2011-01-01
Abstract: This paper introduces the workflow of Chinese text categorisation and the relevant technologies. After analysing the disadvantages of traditional feature selection, it proposes a text categorisation approach that combines rough sets with ensemble learning: feature selection on the text is performed with rough sets, and the ensemble learning algorithm AdaBoost.M1 is employed to boost the performance of a weak classifier that categorises the Chinese text. Experiments indicate that the method classifies better than the C4.5 and kNN classifiers, achieving a higher F1 value on the categorised results.
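The boosting step named in the abstract can be illustrated with a minimal sketch of AdaBoost.M1. This is not the authors' implementation: the abstract does not say which weak classifier was used, so a one-feature decision stump over binary term-presence features is assumed here, and the rough-set feature selection step is not shown. The function names (`train_stump`, `adaboost_m1`, `predict`) and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def train_stump(X, y, w):
    """Weak learner (an assumption): a one-feature decision stump.

    Picks the binary feature whose present/absent split has the lowest
    weighted error; each side predicts its weighted-majority class.
    """
    total = defaultdict(float)
    for yi, wi in zip(y, w):
        total[yi] += wi
    fallback = max(total, key=total.get)  # label for a side with no samples

    best = None
    for j in range(len(X[0])):
        # Weight mass per class for feature absent (0) and present (1).
        mass = {0: defaultdict(float), 1: defaultdict(float)}
        for xi, yi, wi in zip(X, y, w):
            mass[xi[j]][yi] += wi
        rule = {v: max(m, key=m.get) if m else fallback for v, m in mass.items()}
        err = sum(wi for xi, yi, wi in zip(X, y, w) if rule[xi[j]] != yi)
        if best is None or err < best[2]:
            best = (j, rule, err)
    return best


def adaboost_m1(X, y, rounds=10):
    """AdaBoost.M1: reweight samples so later stumps focus on mistakes."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        j, rule, err = train_stump(X, y, w)
        if err >= 0.5:  # M1 requires weighted error below 1/2
            break
        perfect = err == 0
        err = max(err, 1e-10)  # avoid log(1/0) for a perfect stump
        beta = err / (1.0 - err)
        ensemble.append((j, rule, math.log(1.0 / beta)))
        if perfect:
            break
        # Shrink the weights of correctly classified samples, then renormalise.
        w = [wi * beta if rule[xi[j]] == yi else wi
             for xi, yi, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble


def predict(ensemble, x):
    """Weighted vote: each stump votes for its label with weight log(1/beta)."""
    votes = defaultdict(float)
    for j, rule, alpha in ensemble:
        votes[rule[x[j]]] += alpha
    return max(votes, key=votes.get)


# Toy data: each row marks presence (1) or absence (0) of two hypothetical terms.
docs = [[1, 0], [1, 1], [0, 1], [0, 0]]
labels = ["sports", "sports", "finance", "finance"]
model = adaboost_m1(docs, labels, rounds=5)
print(predict(model, [1, 0]))
```

In this sketch `predict` assumes at least one stump was accepted; if the first stump already has weighted error of 1/2 or more, the ensemble is empty and a stronger weak learner would be needed.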