Chinese Text Classification Based on LDA and KSVM

Congwei Liang, Yong Liu, Haiqing Du
DOI: https://doi.org/10.2991/jimet-15.2015.70
2015-01-01
Abstract: With the rapid development of information technology and social networking, the amount of generated text data has increased enormously. As one of the crucial technologies for information organization and management, text classification has become increasingly important in machine learning and natural language processing. In this paper, we present a text classification system. First, we apply the LDA topic model to represent the text instead of the Boolean model or the vector space model. Then, we choose KSVM, which combines SVM with KNN, as the classification algorithm. Finally, we run experiments on a large collection of Chinese news documents. The experimental results show that our system achieves higher classification accuracy than conventional language models.
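The abstract does not say how SVM and KNN are combined, so the following is only a minimal sketch of one common SVM-KNN hybrid rule, not the authors' implementation. It assumes each document has already been reduced to a vector of LDA topic proportions and that a linear SVM has already been trained. Samples far from the separating hyperplane keep the SVM decision, while samples near it are decided by a KNN vote among the support vectors. The function names, the threshold `eps`, the neighbor count `k`, and all numeric values are hypothetical and chosen purely for illustration.

```python
import math

def svm_score(x, w, b):
    """Signed linear decision value f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def knn_vote(x, support_vectors, labels, k=3):
    """Majority vote among the k support vectors nearest to x (Euclidean distance)."""
    ranked = sorted(zip(support_vectors, labels),
                    key=lambda pair: math.dist(x, pair[0]))
    votes = sum(label for _, label in ranked[:k])
    return 1 if votes >= 0 else -1

def ksvm_predict(x, w, b, support_vectors, labels, eps=0.5, k=3):
    """Assumed hybrid rule: trust the SVM far from the hyperplane, use KNN near it."""
    score = svm_score(x, w, b)
    if abs(score) > eps:
        return 1 if score > 0 else -1
    return knn_vote(x, support_vectors, labels, k)

# Toy two-topic example; every value below is made up for illustration.
W, B = (2.0, -2.0), 0.0
SUPPORT_VECTORS = [(0.5, 0.5), (0.45, 0.55), (0.7, 0.3), (0.8, 0.2), (0.55, 0.45)]
LABELS = [-1, -1, 1, 1, 1]

far_positive = ksvm_predict((0.9, 0.1), W, B, SUPPORT_VECTORS, LABELS)
far_negative = ksvm_predict((0.1, 0.9), W, B, SUPPORT_VECTORS, LABELS)
near_boundary = ksvm_predict((0.52, 0.48), W, B, SUPPORT_VECTORS, LABELS)
```

In this toy setup the third document lies close to the hyperplane, so its label comes from its nearest support vectors rather than from the sign of the SVM score. That handling of ambiguous samples is the motivation usually given for this kind of hybrid.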