A Survey of Document Clustering

LIU Yuan-chao, WANG Xiao-long, XU Zhi-ming, GUAN Yi
DOI: https://doi.org/10.3969/j.issn.1003-0077.2006.03.009
2006-01-01
Abstract: As an unsupervised machine learning method, document clustering has been widely used in many NLP applications, such as information retrieval and automatic multi-document summarization. This paper first discusses the background and architecture of document clustering, and then surveys several related problems, including clustering algorithms, feature space construction, dimension reduction, and the semantic problem. Finally, it introduces the evaluation of cluster quality.