A New Measurement of Similarity and Related Clustering Algorithm

HUANG Jian-peng, LU Li-qiang
DOI: https://doi.org/10.15943/j.cnki.fdxb-jns.2006.02.007
2006-01-01
Abstract: Distance and neighborhood relations are common measures of similarity in clustering. After an experimental analysis of the performance of algorithms based on Euclidean distance and k-nearest neighbors (KNN), the definition of SNN (shared k-nearest neighbors) is presented. A new similarity measure based on SNN and KNN is introduced, together with an algorithm that implements it. Clustering results on a simple dataset, a complex dataset, and the KDD Cup'99 network intrusion dataset show that the SNN-based algorithm produces more accurate and more natural clusters than k-means for datasets of differing density, size, and shape. Moreover, the algorithm is not sensitive to the choice of parameters.
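The abstract does not state the exact definition of the SNN-based similarity, so the following is only a minimal sketch of the general shared-nearest-neighbor idea, not the authors' algorithm. It assumes the common Jarvis-Patrick-style convention: two points are similar only if each appears in the other's k-nearest-neighbor list, and their similarity is the number of neighbors the two lists share. The function names `knn_lists` and `snn_similarity` and the sample points are hypothetical and chosen for illustration.

```python
import math


def knn_lists(points, k):
    """Return, for each point, the set of indices of its k nearest
    neighbors under Euclidean distance (the point itself is excluded)."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors


def snn_similarity(points, k):
    """Build a symmetric matrix whose entry [i][j] counts the k-nearest
    neighbors shared by points i and j.

    Assumption (not taken from the paper): the count is kept only when
    i and j are in each other's k-NN lists; otherwise it is 0.
    """
    nn = knn_lists(points, k)
    n = len(points)
    sim = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if j in nn[i] and i in nn[j]:
                sim[i][j] = sim[j][i] = len(nn[i] & nn[j])
    return sim


# Two well-separated groups of four points each (illustrative data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
sim = snn_similarity(points, k=3)
```

Under these assumptions, points in the same group share neighbors and get a positive similarity, while points in different groups get zero regardless of how the absolute distances are scaled. This is the property that makes neighbor-based measures less dependent on density than raw Euclidean distance.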