Abstract:Hierarchical clustering allows better performance in grouping heterogeneous and non-spherical datasets than the center-based clustering, at the expense of increased time complexity. Meanwhile, the bottom-up approach of hierarchical clustering methods often tend to be sensitive or vulnerable to datasets containing obscure cluster boundaries. This paper presents an effective method for hierarchical clustering, called HCNN, which utilizes two types of structural similarities in nearest neighbor graph of a dataset to group similar data into clusters. In particular, the first metric is used to identify those pairs of data with maximal similarity in their nearest neighborhoods that can serve as local centers of every initial cluster. This can contribute to observing the boundaries and detecting clusters, hubs and outliers, thereby alleviating remarkably the influence of obscure boundaries between clusters. The initial clusters will be merged recursively in accordance with the second similarity metric measured in terms of the connectivity between clusters in the nearest neighbor graph. In this case, rather than combining two clusters with highest similarity during each iteration, we consider the maximum similarity as a transitive and closure relation between clusters, i.e., an equivalence relation, which enables more effective and efficient merging of clusters via application of advanced data structures. Experiments based on synthetic and real datasets demonstrate that the proposed HCNN algorithms can possibly outperform the state-of-the-art clustering methods evaluated in terms of the clustering accuracy, normalized mutual information and adjust rand index. Moreover, experiments on unlabeled real datasets show that our method enables effective identification of the quantity of clusters based on Davies–Bouldin index and average silhouette coefficient.

Hierarchical Subtrees Agglomerative Clustering Algorithms

Data Aggregation for Hierarchical Clustering

Hierarchical clustering by aggregating representatives in sub-minimum-spanning-trees

Effective hierarchical clustering based on structural similarities in nearest neighbor graphs

Cluster Merging and Splitting in Hierarchical Clustering Algorithms

Boosting cluster tree with reciprocal nearest neighbors scoring

Scalable Hierarchical Agglomerative Clustering

Mining Arbitrary Shaped Clusters and Outputting a High Quality Dendrogram.

The Impact of Isolation Kernel on Agglomerative Hierarchical Clustering Algorithms

Fair Algorithms for Hierarchical Agglomerative Clustering

Efficient hierarchical clustering of large high dimensional datasets.

Hierarchical Clustering in Astronomy

Varying Density Spatial Clustering Based on a Hierarchical Tree

Order preserving hierarchical agglomerative clustering

Scalable clustering by aggregating representatives in hierarchical groups

A Hierarchical Clustering Algorithm Based on Fuzzy Graph Connectedness.

A Coding Hierarchy Computing Based Clustering Algorithm

A Novel Hierarchical Clustering Approach Based on Universal Gravitation

HICC: an Entropy Splitting-Based Framework for Hierarchical Co-Clustering

Semi-supervised Hierarchical Ensemble Clustering Based on an Innovative Distance Metric and Constraint Information

Hierarchical Agglomerative Clustering with Information Feedback