FICW: Frequent Itemset Based Text Clustering with Window Constraint

Zhou Chong, Lu Yansheng, Zou Lei, Hu Rong
DOI: https://doi.org/10.1007/bf02829264
2006-01-01
Abstract: Most existing text clustering algorithms overlook the fact that a document is a word sequence carrying semantic information, some of which is encoded in the positions of words within the sequence. This paper proposes a novel method, Frequent Itemset-based Clustering with Window (FICW), which exploits this positional semantic information for text clustering by means of a window constraint. Experimental results on three (hypertext) text sets show that FICW outperforms the compared method in both clustering accuracy and efficiency.
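The abstract does not spell out the algorithm, so the following is only an illustrative sketch of what a window constraint on frequent itemsets might look like, not the authors' FICW procedure. It assumes that a word set counts toward a document only if all of its words occur within some span of `window` consecutive tokens, and that support is the number of documents containing such a span. The function names, the `window`, `size`, and `min_support` parameters, and the toy documents are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def windowed_itemsets(tokens, window=3, size=2):
    """Word sets of `size` distinct words that co-occur inside at least
    one span of `window` consecutive tokens (assumed constraint)."""
    found = set()
    # Slide the window over the token sequence; a document shorter
    # than the window is treated as a single span.
    for start in range(max(1, len(tokens) - window + 1)):
        span = sorted(set(tokens[start:start + window]))
        found.update(combinations(span, size))
    return found


def frequent_windowed_itemsets(docs, window=3, size=2, min_support=2):
    """Keep the word sets found in at least `min_support` documents."""
    counts = Counter()
    for tokens in docs:
        # Each document contributes at most once per word set.
        counts.update(windowed_itemsets(tokens, window, size))
    return {items: n for items, n in counts.items() if n >= min_support}


# Hypothetical toy corpus, already tokenized.
docs = [
    "text clustering with frequent itemsets".split(),
    "frequent itemsets for text clustering".split(),
    "clustering images with text labels".split(),
]

# With a window of 2, only adjacent word pairs are counted.
frequent = frequent_windowed_itemsets(docs, window=2, size=2, min_support=2)

# With a window wider than any document, the constraint disappears.
unconstrained = frequent_windowed_itemsets(docs, window=10, size=2, min_support=2)
```

In this toy corpus the third document contains both "clustering" and "text", but not within two consecutive tokens, so it adds to that pair's support only in the unconstrained run. This is the kind of positional distinction the abstract attributes to the window constraint.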