Research of Clustering Algorithms Based on Text Mining

XU Dong-liang, DONG Kai-kun, LI Bin, WANG Yan-fen
DOI: https://doi.org/10.3969/j.issn.2095-6835.2011.05.067
2011-01-01
Abstract: With the rapid growth of massive data on the Internet, how to effectively extract the information needed has become an important issue in text mining. This paper mainly studies the application of the K-means and K-medoids algorithms to text mining. Experiments were conducted to evaluate the two algorithms in terms of accuracy and recall against a manually judged standard. The results show that the K-medoids algorithm scores 5 percent higher than the K-means algorithm in both accuracy and recall, and that K-medoids is more robust when dealing with abnormal and noisy data.
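
The robustness claim can be illustrated with a minimal sketch on toy data. Nothing below comes from the paper: the 2-D points, the single outlier, the starting centers, and the helper names (`assign`, `kmeans`, `kmedoids`) are all assumptions made for illustration, and the K-medoids variant shown is the simple alternating update rather than any specific procedure the authors may have used. Real text mining would operate on document vectors (for example TF-IDF) instead of 2-D points.

```python
import math

def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def kmeans(points, centers, iters=20):
    """K-means: each center is the mean of its members (may be a non-data point)."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_centers.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)

def kmedoids(points, medoids, iters=20):
    """K-medoids (alternating variant): each center is an actual data point,
    the member with the smallest total distance to the other members."""
    medoids = [tuple(m) for m in medoids]
    for _ in range(iters):
        labels = assign(points, medoids)
        new_medoids = []
        for j in range(len(medoids)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_medoids.append(
                    min(members,
                        key=lambda c: sum(math.dist(c, q) for q in members)))
            else:
                new_medoids.append(medoids[j])
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)

# Hypothetical toy data: two compact groups plus one outlier at (30, 30).
points = [
    (1.0, 1.0), (1.0, 3.0), (3.0, 1.0), (3.0, 3.0), (2.0, 2.0),
    (8.0, 8.0), (8.0, 10.0), (10.0, 8.0), (10.0, 10.0), (9.0, 9.0),
    (30.0, 30.0),
]
start = [(1.0, 1.0), (8.0, 8.0)]  # fixed starting centers, for repeatability

km_centers, km_labels = kmeans(points, start)
kmed_medoids, kmed_labels = kmedoids(points, start)

print("K-means centers:  ", km_centers)
print("K-medoids medoids:", kmed_medoids)
```

On this toy data both methods produce the same grouping, but the K-means center of the second cluster is pulled to (12.5, 12.5), outside the dense group, while the K-medoids center stays at the data point (9.0, 9.0). This is one intuition for why a medoid-based method can be less sensitive to abnormal points; it says nothing about the 5 percent figure reported in the abstract.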