Abstract: The traditional K-means clustering algorithm easily falls into a local optimum because its initial cluster centers are generated randomly. This paper proposes an improved K-means algorithm based on distance and expected density. The algorithm takes both the distances between data objects and their density distribution into account: it chooses as initial centers the k objects that have the highest densities and the largest mutual distances. This initialization makes the algorithm less likely to be trapped in a poor local optimum and helps it approach a globally optimal solution. Experiments on UCI datasets show that the improved algorithm achieves better clustering performance.

A K-MEANS CLUSTERING ALGORITHM BASED ON DISTANCE AND EXPECTED DENSITY PARAMETER
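
The initialization idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the definition of the expected density parameter, so everything specific here is an assumption made for illustration only: the function name `init_centers`, the neighbourhood radius (the mean pairwise distance), the density measure (the number of neighbours within that radius), the candidate threshold (the mean density), and the greedy farthest-point selection. It is not the paper's exact procedure.

```python
import numpy as np

def init_centers(X, k):
    """Pick k initial centers that are dense and far apart (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Assumption: neighbourhood radius = mean distance over distinct pairs.
    radius = dist.sum() / (n * (n - 1))
    # Assumption: density = number of other points within the radius.
    density = (dist <= radius).sum(axis=1) - 1
    # Assumption: candidates are points at least as dense as the average.
    candidates = np.flatnonzero(density >= density.mean())
    if len(candidates) < k:
        # Too few dense points: fall back to considering every point.
        candidates = np.arange(n)
    # First center: the densest candidate.
    chosen = [int(candidates[np.argmax(density[candidates])])]
    while len(chosen) < k:
        # Distance from each candidate to its nearest already-chosen center.
        nearest = dist[np.ix_(candidates, chosen)].min(axis=1)
        # Next center: the candidate farthest from all chosen centers.
        chosen.append(int(candidates[np.argmax(nearest)]))
    return X[chosen]
```

The returned array could then be passed as the starting centers of a standard K-means run, for example through the `init` argument of scikit-learn's `KMeans` with `n_init=1`. Computing the full distance matrix costs O(n²) memory, so this sketch is only suitable for small datasets.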