HDenDist: Nonlinear Hierarchical Clustering Based on Density and Min-distance

Wen-Qi Fan, Chang-Dong Wang, Yuan-Wei Chen, Jian-Huang Lai
DOI: https://doi.org/10.1109/bdcloud.2015.16
2015-01-01
Abstract: Hierarchical clustering has received a great deal of attention because it can capture hierarchical cluster structure in an unsupervised way. Despite this success, most existing hierarchical clustering algorithms have drawbacks: (1) difficulty in selecting clusters to merge or split, (2) inefficient and inaccurate cluster validation, and (3) limitation to linearly separable clusters. To address these issues, this paper proposes a new nonlinear hierarchical clustering method termed HDenDist. The method is based on two observations/designs associated with density and min-distance. The first is that cluster centers have a higher density, are surrounded by data points of lower density, and lie relatively far from one another. The second is a min-distance defined between nodes, which is used to determine how to divide a node in the hierarchical tree into two sub-cluster nodes. Several divide-and-conquer tricks are designed to further reduce the sensitivity to parameters. In addition, density and distance are combined to determine when to stop splitting the cluster nodes. In experimental studies, the proposed method has shown promising results on real datasets.
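The first observation in the abstract (cluster centers are dense and far from other dense points) can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption: the cutoff-count density, the "distance to the nearest point of higher density" quantity, and the names density_and_min_distance, points, and cutoff are hypothetical and follow the general density-peaks idea rather than HDenDist itself. The node-level min-distance and the splitting and stopping rules are not modeled here.

```python
import numpy as np

def density_and_min_distance(points, cutoff):
    """Illustrative only; not the HDenDist definitions (assumed quantities)."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    # Assumed density: number of other points closer than `cutoff`.
    density = (dist < cutoff).sum(axis=1) - 1
    n = len(points)
    delta = np.empty(n)
    for i in range(n):
        higher = density > density[i]
        if higher.any():
            # Distance to the nearest point with strictly higher density.
            delta[i] = dist[i, higher].min()
        else:
            # No denser point exists: use the largest distance from point i.
            delta[i] = dist[i].max()
    return density, delta
```

Under these assumptions, points for which both density and delta are large are candidate cluster centers; ranking by the product density * delta is one simple way to surface them.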