Document Classification Based on Support Vector Machine Using a Concept Vector Model

Shuang Deng, Hong Peng
DOI: https://doi.org/10.1109/WI.2006.65
2006-01-01
Abstract: This paper proposes a new method for document categorization based on a support vector machine (SVM) using a concept vector model (CVM). Traditional document classification usually ignores the semantic relations among keywords or documents. To address this semantic problem, a domain ontology is used to capture the semantic information shared by different terms or keywords in the documents. With the concept vector model, domain-related semantic information can be extracted from documents more precisely. In the model, a concept vector is extracted from each document by a matching method. Based on these concept features, the SVM assigns each document to a suitable category. The experimental results show that the CVM method yields higher accuracy than traditional term-based vector space model (VSM) methods.
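The matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the ontology or the exact matching method, so everything below is an assumption for illustration: the toy `ONTOLOGY`, its term lists, and the `concept_vector` function are hypothetical, and matching is reduced to exact token lookup.

```python
import re
from collections import Counter

# Hypothetical toy ontology (not from the paper): each concept maps to
# surface terms that are assumed to express it.
ONTOLOGY = {
    "vehicle": {"car", "truck", "automobile"},
    "finance": {"bank", "loan", "interest"},
    "medicine": {"doctor", "drug", "hospital"},
}

# Fixed, sorted concept order defines the vector dimensions:
# ["finance", "medicine", "vehicle"].
CONCEPTS = sorted(ONTOLOGY)

# Reverse index from term to concept, used by the matching step.
TERM_TO_CONCEPT = {term: concept
                   for concept, terms in ONTOLOGY.items()
                   for term in terms}


def concept_vector(text):
    """Count, per concept, how many tokens in `text` match one of its terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(TERM_TO_CONCEPT[t] for t in tokens if t in TERM_TO_CONCEPT)
    return [counts[c] for c in CONCEPTS]


doc_a = "The car and the automobile were parked near the truck."
doc_b = "The bank raised the interest on the loan."
print(concept_vector(doc_a))  # -> [0, 0, 3]
print(concept_vector(doc_b))  # -> [3, 0, 0]
```

In a term-based VSM, "car", "automobile", and "truck" would occupy three separate dimensions; here they collapse into the single "vehicle" concept, which is the kind of semantic relation the abstract says term-based methods ignore. The resulting vectors could then be passed as features to any standard SVM implementation.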