Text Categorization Based on Frequent Patterns with Term Frequency

XY Chen, Y Chen, L Wang, YF Hu
DOI: https://doi.org/10.1109/icmlc.2004.1382032
2004-01-01
Abstract: Association categorization based on frequent patterns is a recently proposed technique that builds classification rules from the frequent patterns found in each category and uses these rules to classify new text. However, current association classification methods have two shortcomings when applied to text data. First, they ignore how often a word occurs within a text. Second, they must prune rules when a large number of rules is generated, which lowers classification accuracy. This paper therefore presents a text categorization algorithm based on frequent patterns with term frequency, which achieves higher performance than other association categorization methods and some current text classification methods. Our study provides evidence that association rule mining can be used to construct fast and effective classifiers for automatic text categorization.
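The general idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the mining procedure or the weighting formula, so the function names (`mine_patterns`, `train`, `classify`), the restriction to term sets of at most two terms, and the scoring rule (pattern support multiplied by the lowest in-document count among the pattern's terms) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_patterns(docs, min_support=1.0, max_len=2):
    """Return {term_set: support} for term sets found in at least
    min_support (a fraction) of the tokenized docs of one category."""
    counts = Counter()
    for tokens in docs:
        terms = sorted(set(tokens))  # presence only; each doc counts once
        for k in range(1, max_len + 1):
            counts.update(combinations(terms, k))
    n = len(docs)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


def train(labeled_docs, min_support=1.0):
    """Group (tokens, category) pairs by category and mine each group."""
    by_cat = {}
    for tokens, cat in labeled_docs:
        by_cat.setdefault(cat, []).append(tokens)
    return {cat: mine_patterns(docs, min_support) for cat, docs in by_cat.items()}


def classify(rules, tokens):
    """Score each category by its matching patterns, weighted by term frequency."""
    tf = Counter(tokens)
    scores = {}
    for cat, patterns in rules.items():
        score = 0.0
        for pattern, support in patterns.items():
            if all(t in tf for t in pattern):
                # Assumed weighting: support times the lowest count among
                # the pattern's terms in the document being classified.
                score += support * min(tf[t] for t in pattern)
        scores[cat] = score
    return max(scores, key=scores.get), scores


training = [
    (["ball", "goal", "team"], "sports"),
    (["team", "goal", "match"], "sports"),
    (["stock", "market", "bank"], "finance"),
    (["bank", "market", "rate"], "finance"),
]
rules = train(training, min_support=1.0)
label, scores = classify(rules, ["goal", "goal", "team", "bank"])
```

In this toy run the repeated word "goal" raises the score of the sports category, which is the kind of within-document frequency information that a presence-only rule would discard.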