Clustering-dynamic-growing Clustering Algorithm Based on Sentence-Words Matrix

SUN Hui, CHEN Xiaoyun, MA Zhixin
DOI: https://doi.org/10.3321/j.issn:1000-0054.2005.09.020
2005-01-01
Abstract: Web information is growing exponentially, yet conventional search engines cannot return exact and concise information. To address this problem, a clustering-dynamic-growing clustering algorithm based on the sentence-words matrix was developed. The algorithm is a plane-partition algorithm with three steps. In the first step, Web data preprocessing, the text is extracted and filtered. In the second step, a sentence-words matrix is formed for each document, and the matrices of a group of documents are collected into matrix sets. In the third step, similar documents are clustered using the clustering-dynamic-growing clustering algorithm. Experimental results show that the algorithm preserves the semantic connections between documents while achieving high document clustering accuracy.
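The abstract does not give the details of the matrix construction or of the growing rule, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a sentence-words matrix whose entry (i, j) counts word j in sentence i, and it swaps in a simple threshold-based incremental scheme (a document joins the most similar existing cluster, otherwise it starts a new one) for the clustering step. The function names, the cosine similarity measure, and the threshold value are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def sentence_words_matrix(text):
    """Return (vocab, matrix); matrix[i][j] counts word j in sentence i.

    Assumption: sentences are split on '.', words on whitespace.
    """
    sentences = [s.split() for s in text.lower().split(".") if s.strip()]
    vocab = sorted({w for s in sentences for w in s})
    index = {w: j for j, w in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in sentences]
    for i, sentence in enumerate(sentences):
        for w in sentence:
            matrix[i][index[w]] += 1
    return vocab, matrix


def doc_vector(text):
    """Collapse the sentence-words matrix into word totals (column sums)."""
    vocab, matrix = sentence_words_matrix(text)
    return Counter({w: sum(row[j] for row in matrix) for j, w in enumerate(vocab)})


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def grow_clusters(docs, threshold=0.3):
    """Assign each document to its most similar cluster, or start a new one.

    Assumption: this threshold rule stands in for the paper's growing step.
    """
    clusters = []  # each cluster: {"centroid": Counter, "members": [doc index]}
    for i, text in enumerate(docs):
        vec = doc_vector(text)
        best = max(clusters, key=lambda c: cosine(vec, c["centroid"]), default=None)
        if best is not None and cosine(vec, best["centroid"]) >= threshold:
            best["members"].append(i)
            best["centroid"].update(vec)  # centroid accumulates member counts
        else:
            clusters.append({"centroid": Counter(vec), "members": [i]})
    return [c["members"] for c in clusters]


docs = [
    "web search engines index pages. search engines rank pages",
    "search engines crawl web pages",
    "protein folding in cells. cells fold protein chains",
]
print(grow_clusters(docs))
```

Because the number of clusters is not fixed in advance, the partition grows as documents arrive; the threshold controls how readily a new cluster is opened. A real implementation would also need the extraction and filtering of step one (for example, markup and stop-word removal), which this sketch omits.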