Clustering Description Extraction Based on Statistical Machine Learning

Chengzhi Zhang,Hongjiao Xu
DOI: https://doi.org/10.1109/iita.2008.114
2008-01-01
Abstract:Clustering description problem is one of key issues of the traditional document clustering algorithm. The traditional document algorithm can cluster the objects, but it can not give concept description for the clustered results. Document clustering description is a problem of labeling the clustered results of document collection clustering. It can help users determine whether one of the clusters is relevant to users' information requirement. Therefore, labeling a clustered set of documents is an important and challenging work in document clustering applications. To resolve the problem of the weak readability of the traditional document clustering results, a method of automatic labeling documents clusters based on machine learning is put forward. Experimental results show that the method based on SVM will provide users with more concise and comprehensive document clustering results. It also reflects the linear trend of clustering description problem.
What problem does this paper attempt to address?