Solutions to General Clustering Algorithmic Issues

姜园,张朝阳,仇佩亮,戚玉鹏
DOI: https://doi.org/10.3969/j.issn.1007-0249.2004.03.021
2004-01-01
Abstract: Clustering is widely used in fields such as statistics, machine learning, pattern recognition, and numerical analysis, and it has attracted growing attention in recent years. This paper discusses five commonly raised issues: assessment of clustering results, estimation of the total number of clusters, data preparation, measures of data proximity, and outlier handling. Representative solutions to these issues are surveyed, conclusions are summarized, and development trends for algorithms that address these five issues are forecast.