Abstract: Clustering plays a pivotal role in knowledge processing, knowledge bases, and expert systems, enabling AI systems to acquire knowledge effectively. Hierarchical clustering, in particular, represents knowledge hierarchically by transforming raw data into one or more tree-shaped components. However, it is difficult to pinpoint appropriate representative points within the lower levels of the cluster tree. These points are essential because they serve as the roots for subsequent aggregation in the upper levels of the cluster tree. Traditional hierarchical clustering algorithms rely on rudimentary techniques to select these representative points, which may not represent the underlying data adequately. Consequently, the resulting cluster tree often falls short in empirical performance. To address this shortcoming, we propose a new hierarchical clustering algorithm in this paper. The proposed algorithm efficiently identifies the representative point within each sub-minimum-spanning-tree during the construction of the cluster tree by applying a topology-based score to the reciprocal nearest data points. Tests on UCI datasets show that the proposed algorithm achieves higher clustering accuracy (measured by Rand Index and Normalized Mutual Information) than the benchmark algorithms. Further analysis shows that the algorithm has O(n log n) time complexity and O(log n) space complexity, indicating that it scales to large data with low time and storage costs. In particular, the algorithm can process up to two million data points on a standard personal computer, which underscores its cost-effectiveness.
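To make the notion of reciprocal nearest data points concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes Euclidean distance and uses a brute-force O(n^2) search, so it does not reflect the O(n log n) complexity stated above, and it omits the topology-based scoring entirely. The function names `nearest_neighbor` and `reciprocal_nearest_pairs` and the sample points are hypothetical, chosen only for illustration.

```python
from math import dist


def nearest_neighbor(points, i):
    """Return the index of the point closest to points[i], excluding i itself."""
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: dist(points[i], points[j]),
    )


def reciprocal_nearest_pairs(points):
    """Return sorted index pairs (i, j), i < j, that are each other's nearest neighbor.

    Brute-force illustration only (assumption: Euclidean distance).
    """
    nn = [nearest_neighbor(points, i) for i in range(len(points))]
    return sorted((i, j) for i, j in enumerate(nn) if i < j and nn[j] == i)


# Hypothetical sample data: two tight pairs plus one point that has a
# nearest neighbor but is not anyone's nearest neighbor in return.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.4, 0.0)]
pairs = reciprocal_nearest_pairs(points)
print(pairs)
```

In this sketch the last point is closest to the second point, but the second point is closest to the first, so the last point belongs to no reciprocal pair. Under the abstract's description, such pairs are the candidates that the proposed scoring would rank when selecting a representative point.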

Clustering through decision tree construction

Interpretable fuzzy clustering using unsupervised fuzzy decision trees

Unsupervised clustering algorithm on 3D model library via decision graph

Unsupervised Learning Via An Iteratively Constructed Clustering Ensemble

A Tree-Based Incremental Overlapping Clustering Method Using the Three-Way Decision Theory

Using Decision Trees for Interpretable Supervised Clustering

Unsupervised Deep Discriminant Analysis Based Clustering

A Cluster Tree Method for Text Categorization

Clustering Based on Supervised Learning of Exemplar Discriminative Information

Semi-supervised Hierarchical Clustering Analysis for High Dimensional Data

A Three-Way Decisions Clustering Algorithm for Incomplete Data

Interpretable clustering using unsupervised binary trees

Boosting cluster tree with reciprocal nearest neighbors scoring

A Novel Decision Cluster Classifier with Nested Agglomerative K-Means

Clustering by Constructing Hyper-Planes

On cluster tree for nested and multi-density data clustering

Supervised Convex Clustering

SUDEPHIC: Self-Tuning Density-Based Partitioning and Hierarchical Clustering

A Three-Way Clustering Method Based on Ensemble Strategy and Three-Way Decision

Kernel KMeans clustering splits for end-to-end unsupervised decision trees

DTEC: Decision Tree-Based Evidential Clustering for Interpretable Partition of Uncertain Data