Content-Oriented Automatic Text Categorization with the Cognitive Situation Models

Yi Guo, Zhiqing Shao, Hua Nan
DOI: https://doi.org/10.1109/iscsct.2008.63
2008-01-01
Abstract: Text categorization is an important research field within text mining. Its original objective is to recognize, understand, and organize varying volumes of texts or documents. Categorization is generally treated as a supervised learning task, in which similarity is inferred from a collection of pre-categorized training texts. The typical approaches, however, are restricted to the single-word level and are not content-oriented. This paper introduces a content-oriented automatic text categorization algorithm (CogCate), inspired by cognitive situation models, that simulates the human cognitive procedure in the text categorization task. CogCate is not limited to traditional statistical analysis at the word level; it also includes a process of lexical or semantic analysis, which helps ensure the accuracy of categorization. Evaluation experiments confirm the precision of CogCate. CogCate also substantially reduces the time and effort spent on training and corpus maintenance, and shows that text categorization can benefit from interdisciplinary research.