Research on K-means Text Clustering Algorithm Based on Semantic

Yufang Liu, Shibin Xiao, Xueqiang Lv, Shuicai Shi
DOI: https://doi.org/10.1109/CCIE.2010.39
2010-01-01
Abstract: Building on research into the K-means algorithm for text clustering and the semantic-based vector space model, a semantic-based K-means text clustering model is proposed to address the high dimensionality and sparsity of text data sets. The model reduces the semantic loss of the text data and improves the quality of text clustering. Experiments show that semantic-based text clustering improves the final F1 score by more than 6 percent over non-semantic-based clustering.
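The sparsity problem named in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the `CONCEPTS` map is a hypothetical stand-in for whatever semantic resource the paper uses, the toy `docs` are invented for illustration, and the clustering is a plain cosine-based K-means with a deterministic initialisation chosen only to keep the example reproducible.

```python
import math
from collections import Counter

# Hypothetical word-to-concept map (an assumption, not taken from the paper).
CONCEPTS = {"car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
            "apple": "fruit", "banana": "fruit", "orange": "fruit"}

def vectorize(text, concepts=None):
    """Build an L2-normalised term-frequency vector, optionally mapping words to concepts."""
    tokens = text.lower().split()
    if concepts:
        tokens = [concepts.get(t, t) for t in tokens]
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()}

def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors stored as dicts."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def centroid(vectors):
    """Sum the member vectors and rescale the result to unit length."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
    return {t: w / norm for t, w in total.items()}

def kmeans(vectors, k, iterations=10):
    """K-means with cosine similarity; the first k vectors seed the centroids."""
    centroids = list(vectors[:k])
    labels = []
    for _ in range(iterations):
        # Assign each document to its most similar centroid (ties go to the lower index).
        labels = [max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors]
        # Recompute each non-empty cluster's centroid.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = centroid(members)
    return labels

docs = ["car dealer", "apple juice", "automobile truck", "banana orange"]

plain_labels = kmeans([vectorize(d) for d in docs], k=2)
semantic_labels = kmeans([vectorize(d, CONCEPTS) for d in docs], k=2)
print("word-level labels:   ", plain_labels)
print("concept-level labels:", semantic_labels)
```

In the word-level run, "automobile truck" and "banana orange" share no terms with any other document, so their similarity to every centroid is zero and their assignment is decided by tie-breaking alone. After mapping words to shared concepts, the two vehicle documents and the two fruit documents overlap and fall into separate clusters.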