An improved clustering algorithm for web document

Jing Wang, Zhijing Liu
2009-01-01
Journal of Information and Computational Science
Abstract: To address the major challenges of current web document clustering, we first propose a novel multiple-feature vector space model to describe the special features of web documents, and then introduce a Simple Agglomerative hierarchical K-Means clustering algorithm for web document clustering. Experimental results indicate that the novel algorithm significantly improves the quality of the clustering results compared with the traditional algorithm, and also reduces the algorithm's running time by 30%. Copyright ©2009 Binary Information Press.