Hierarchical classification of Chinese documents based on N-grams

Zhou Shui-geng, Guan Jihong, He Yanxiang
DOI: https://doi.org/10.1007/bf03160278
2001-01-01
Wuhan University Journal of Natural Sciences
Abstract: We explore techniques for using N-gram information to categorize Chinese text documents hierarchically, so that the classifier is freed from the burden of large dictionaries and complex segmentation processing and is consequently domain and time independent. A hierarchical Chinese text classifier is implemented. Experimental results show that hierarchically classifying Chinese text documents based on N-grams achieves satisfactory performance and outperforms other traditional Chinese text classifiers.
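To illustrate why N-gram features need neither a dictionary nor a word segmenter, here is a minimal sketch of character N-gram extraction over raw Chinese text. The function names `char_ngrams` and `ngram_profile` and the choice of N-gram sizes are assumptions made for illustration; the abstract does not specify the paper's actual feature selection, weighting, or classifier.

```python
from collections import Counter


def char_ngrams(text, n=2):
    """Return overlapping character n-grams of `text`, ignoring whitespace.

    No word segmentation or dictionary lookup is involved: the n-grams
    are taken directly from the character sequence.
    """
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def ngram_profile(text, sizes=(1, 2)):
    """Count n-grams of several sizes (hypothetical feature profile)."""
    counts = Counter()
    for n in sizes:
        counts.update(char_ngrams(text, n))
    return counts


# Example: bigrams of a short phrase ("Chinese text classification").
bigrams = char_ngrams("中文文本分类", 2)
profile = ngram_profile("中文文本分类")
```

Such a profile could then feed any standard classifier at each level of a category hierarchy; which classifier and which N-gram sizes work best is a matter for the paper's experiments, not this sketch.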