Design and Implementation of a Multi-Label Chinese Text Categorization System

Junli Chen, Xuezhong Zhou, Zhaohui Wu
DOI: https://doi.org/10.1109/wcica.2004.1341906
2004-01-01
Abstract: A multi-label Chinese text categorization system based on a Chinese character representation and the boosting algorithm is demonstrated. The system has been successfully tested on two multi-labeled datasets: the traditional Chinese medicine (TCM) dataset TCM-MED and Reuters-21578. Experiments have also been carried out to compare the performance of the boosting algorithm with that of two other traditional algorithms on both datasets. The results indicate that the boosting algorithm outperforms the other two algorithms in Chinese text categorization.
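The abstract does not give implementation details, so the following is only a minimal sketch of how a character-based representation and boosting might fit together for multi-label categorization. It assumes each document is reduced to the set of characters it contains, trains one binary AdaBoost classifier per label (a binary-relevance scheme, which may differ from the boosting variant used in the paper), and uses single-character presence tests as weak learners. All function names, the toy texts, and the labels are hypothetical.

```python
import math

def char_features(text):
    """Assumed representation: the set of non-space characters in a document."""
    return {ch for ch in text if not ch.isspace()}

def stump(ch, polarity, doc):
    """Weak learner: vote `polarity` if `ch` occurs in the document, else the opposite."""
    return polarity if ch in doc else -polarity

def train_adaboost(docs, labels, rounds=10):
    """Binary AdaBoost over character-presence stumps; labels are +1 or -1."""
    n = len(docs)
    weights = [1.0 / n] * n
    vocab = sorted(set().union(*docs))
    model = []  # list of (alpha, character, polarity)
    for _ in range(rounds):
        # Pick the stump with the lowest weighted error.
        best = None
        for ch in vocab:
            for polarity in (1, -1):
                err = sum(w for w, d, y in zip(weights, docs, labels)
                          if stump(ch, polarity, d) != y)
                if best is None or err < best[0]:
                    best = (err, ch, polarity)
        err, ch, polarity = best
        if err >= 0.5:
            break  # no stump is better than chance
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, ch, polarity))
        if err <= 1e-10:
            break  # training set already separated
        # Increase the weight of misclassified documents, then renormalize.
        weights = [w * math.exp(-alpha * y * stump(ch, polarity, d))
                   for w, d, y in zip(weights, docs, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def train_multilabel(texts, label_sets, rounds=10):
    """Binary relevance: one boosted classifier per label."""
    docs = [char_features(t) for t in texts]
    all_labels = sorted(set().union(*label_sets))
    return {lab: train_adaboost(docs, [1 if lab in ls else -1 for ls in label_sets], rounds)
            for lab in all_labels}

def predict(models, text):
    """Return every label whose boosted score is positive."""
    doc = char_features(text)
    return {lab for lab, model in models.items()
            if sum(a * stump(ch, p, doc) for a, ch, p in model) > 0}

# Hypothetical toy data, not taken from TCM-MED or Reuters-21578.
texts = ["黄芪补气", "当归补血", "黄芪当归补气血", "针灸止痛"]
label_sets = [{"qi"}, {"blood"}, {"qi", "blood"}, {"pain"}]
models = train_multilabel(texts, label_sets)

print(sorted(predict(models, "黄芪益气")))        # → ['qi']
print(sorted(predict(models, "当归黄芪补气血")))  # → ['blood', 'qi']
```

Working at the character level avoids a word-segmentation step, which is one plausible motivation for a character representation in Chinese text; whether the system uses single characters, character n-grams, or weighted features is not stated in the abstract.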