A CHINESE DOCUMENT CATEGORIZATION SYSTEM WITHOUT DICTIONARY SUPPORT AND SEGMENTATION PROCESSING

周水庚,关佶红,胡运发,周傲英
2001-01-01
Journal of Computer Research and Development
Abstract: This paper presents a Chinese document categorization system that requires neither dictionary support nor word segmentation. The system uses N-gram information instead of Chinese words as features, so the classifier no longer depends on dictionaries or segmentation and is therefore independent of domain and time. An open architecture is adopted to facilitate functional expansion and performance improvement. Experimental results show that the system achieves satisfactory categorization performance.
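
The idea of using N-grams in place of segmented words can be illustrated with a minimal sketch. The abstract does not specify the feature weighting or the classification algorithm, so everything below is an assumption made for illustration: the function names (`char_ngrams`, `profile`, `cosine`, `classify`), the choice of bigrams (`n=2`), the cosine-similarity comparison against per-category N-gram counts, and the toy training sentences are all hypothetical and are not taken from the paper.

```python
from collections import Counter
import math


def char_ngrams(text, n=2):
    """Return overlapping character n-grams; no dictionary or segmenter is used."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def profile(docs, n=2):
    """Aggregate n-gram counts over all training documents of one category."""
    counts = Counter()
    for doc in docs:
        counts.update(char_ngrams(doc, n))
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, profiles, n=2):
    """Pick the category whose n-gram profile is most similar to the text."""
    vec = Counter(char_ngrams(text, n))
    return max(profiles, key=lambda label: cosine(vec, profiles[label]))


# Toy training data (hypothetical, for illustration only).
TRAIN = {
    "sports": ["足球比赛今晚开始", "篮球比赛结果公布"],
    "finance": ["股票市场今日上涨", "银行利率再次下调"],
}
PROFILES = {label: profile(docs) for label, docs in TRAIN.items()}

if __name__ == "__main__":
    print(char_ngrams("足球比赛"))
    print(classify("今晚的足球比赛", PROFILES))
```

Because the features are raw character sequences, new terms that appear in a corpus enter the feature space without any dictionary update, which is the property the abstract describes as domain and time independence. A practical system would likely add feature selection and weighting on top of the raw counts.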