Mining Arbitrary Shaped Clusters and Outputting a High Quality Dendrogram.

Hao Huang, Song Wang, Shuangke Wu, Yunjun Gao, Wei Lu, Qinming He, Shi Ying
DOI: https://doi.org/10.1007/978-3-319-44403-1_10
2016-01-01
Abstract: Hierarchical clustering (HC for short) outputs a dendrogram that offers more topological information than flat clustering (e.g., k-means). However, existing HC algorithms focus on either the quality of the dendrogram or the ability to mine arbitrary shaped clusters. To address both aspects simultaneously, we present HICMEN, which adopts (1) the classic agglomerative clustering framework, which can generate a complete dendrogram, and (2) a novel similarity measure based on mutual k-nearest neighbors to capture the connectivity of data points and help properly merge each arbitrary shaped cluster piece by piece. More importantly, we prove that the similarity measure has a property called weak monotonicity, which guarantees the quality of the dendrogram generated by HICMEN. Extensive experimental results show that HICMEN mines arbitrary shaped clusters effectively and simultaneously outputs a high quality dendrogram.
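The mutual k-nearest-neighbor relation that the abstract builds on can be illustrated with a minimal sketch. The abstract does not define HICMEN's actual similarity measure, so the code below only computes which pairs of points are mutual k-nearest neighbors; it is not the authors' method. The function name `mutual_knn_pairs`, the Euclidean distance, and the toy data are assumptions made for illustration.

```python
import numpy as np


def mutual_knn_pairs(points, k):
    """Return index pairs (i, j), i < j, where each point is among the
    other's k nearest neighbors (hypothetical helper, Euclidean distance)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    # Pairwise Euclidean distances; the diagonal is set to infinity so that
    # a point is never counted as its own neighbor.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)
    # Indices of the k nearest neighbors of every point.
    knn = np.argsort(dist, axis=1)[:, :k]
    neighbor_sets = [set(row.tolist()) for row in knn]
    # Keep a pair only when the neighbor relation holds in both directions.
    return {
        (i, j)
        for i in range(n)
        for j in neighbor_sets[i]
        if i < j and i in neighbor_sets[j]
    }


# Toy data (assumed): two well-separated groups of points.
toy_points = [[0, 0], [0, 1], [0, 2.5], [10, 0], [10, 1]]
pairs = mutual_knn_pairs(toy_points, k=2)
```

On this toy data the mutual pairs stay within each group: a point in the right-hand group may list a left-hand point among its two nearest neighbors, but the relation is not returned in the other direction. That asymmetry is why a mutual relation is a plausible signal of connectivity when merging cluster pieces.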