Text Categorization Based On Term Co-Occurrence Concept

Maoshu Ni, Hongfei Lin
2007-01-01
Abstract: Traditional text categorization selects individual words as features and builds a vector space model (VSM) that represents each document by its weighted features. However, each word in this model is treated independently, so semantic relations between words are not captured. Drawing on association rules from data mining, this paper presents a new text-representation method that combines the traditional VSM with the concept of term co-occurrence, and applies it to text categorization. Experiments indicate that this method expresses the semantic content of a document better than the traditional VSM and also achieves better categorization results.
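The abstract does not spell out how co-occurring terms are mined or weighted, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that two terms "co-occur" when they appear in the same document, that pairs are kept when they pass association-rule style support and confidence thresholds, and that each kept pair is appended to the ordinary term-count vector as an extra feature. The function names `mine_cooccurrences` and `vectorize`, the thresholds, and the toy documents are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_cooccurrences(docs, min_support=0.5, min_confidence=0.6):
    """Return term pairs whose document-level co-occurrence passes both thresholds.

    Assumption: support = fraction of documents containing both terms;
    confidence = the larger of P(b | a) and P(a | b).
    """
    n_docs = len(docs)
    term_df = Counter()  # number of documents containing each term
    pair_df = Counter()  # number of documents containing each term pair
    for tokens in docs:
        vocab = sorted(set(tokens))
        term_df.update(vocab)
        pair_df.update(combinations(vocab, 2))

    pairs = []
    for (a, b), count in pair_df.items():
        support = count / n_docs
        confidence = max(count / term_df[a], count / term_df[b])
        if support >= min_support and confidence >= min_confidence:
            pairs.append((a, b))
    return sorted(pairs)


def vectorize(tokens, vocabulary, pairs):
    """Build a term-count vector extended with one feature per mined pair."""
    counts = Counter(tokens)
    term_part = [counts[t] for t in vocabulary]
    # Assumption: a pair feature takes the smaller of its two term counts,
    # so it is zero unless both terms are present in the document.
    pair_part = [min(counts[a], counts[b]) for a, b in pairs]
    return term_part + pair_part


# Toy corpus (hypothetical data, already tokenized).
docs = [
    ["stock", "market", "price"],
    ["stock", "market", "trade"],
    ["football", "match", "goal"],
    ["stock", "price", "football"],
]
vocabulary = sorted({t for d in docs for t in d})
pairs = mine_cooccurrences(docs, min_support=0.5, min_confidence=0.6)
vectors = [vectorize(d, vocabulary, pairs) for d in docs]
```

In this sketch the extended vectors could be passed to any standard classifier in place of the plain term vectors; raw counts are used here only for brevity, and a weighting scheme such as TF-IDF could be substituted.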