Abstract: Clustering ensemble is a popular approach for identifying data clusters that combines the results of multiple base clustering algorithms to produce more accurate and robust clusters. However, the performance of clustering ensemble algorithms depends heavily on the quality of the ensemble members. To address this problem, this paper proposes a member enhancement‐based clustering ensemble (MECE) algorithm that selects the ensemble members by considering their distribution consistency. MECE has two main components: heterocluster splitting and homocluster merging. The first component estimates two probability density functions (p.d.f.s) on the sample points of a heterocluster, one represented by a Gaussian distribution and the other by a Gaussian mixture model. If the random numbers generated from these two p.d.f.s follow different probability distributions, the heterocluster is split into smaller clusters. The second component merges clusters that have high neighborhood densities into a homocluster, where the neighborhood density is measured using a novel evaluation criterion. In addition, a co‐association matrix is presented, which serves as a summary of the ensemble of diverse clusters. A series of experiments was conducted to evaluate the feasibility and effectiveness of the proposed ensemble member generation algorithm. The results show that MECE can select high‐quality ensemble members and, as a result, yields better clusterings than six state‐of‐the‐art ensemble clustering algorithms: the cluster‐based similarity partitioning algorithm (CSPA), the meta‐clustering algorithm (MCLA), hybrid bipartite graph formulation (HBGF), evidence accumulation clustering (EAC), locally weighted evidence accumulation (LWEA), and locally weighted graph partition (LWGP). Specifically, MECE achieves nearly 23% higher average NMI, 27% higher average ARI, 15% higher average FMI, and 10% higher average purity than the CSPA, MCLA, HBGF, EAC, LWEA, and LWGP algorithms. The experimental results demonstrate that MECE is a valid approach to clustering ensemble problems.
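The abstract mentions a co‐association matrix as a summary of the ensemble but does not say how MECE builds or weights it. As a minimal sketch, assuming the standard evidence-accumulation form (entry (i, j) is the fraction of base clusterings that place points i and j in the same cluster), it could look like the following. The function name `co_association` and the toy partitions are hypothetical and are not taken from the paper.

```python
def co_association(partitions):
    """Build an n x n co-association matrix from several base clusterings.

    Assumption: the plain, unweighted form used in evidence accumulation,
    not necessarily the exact matrix defined in the MECE paper.

    partitions: list of label sequences, one per base clustering,
                each of length n (one label per sample point).
    """
    if not partitions:
        raise ValueError("need at least one base clustering")
    n = len(partitions[0])
    if any(len(labels) != n for labels in partitions):
        raise ValueError("all base clusterings must label the same points")

    # Count how many base clusterings put each pair in the same cluster.
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1

    # Normalise the counts to fractions in [0, 1].
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


# Three hypothetical base clusterings of four points.
example_partitions = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
example_matrix = co_association(example_partitions)
```

Here points 0 and 1 share a cluster in all three clusterings, so their entry is 1, while points 0 and 3 never do, so their entry is 0. A consensus clustering is then typically obtained by running a hierarchical or graph-based method on this matrix.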

A H-K clustering algorithm based on ensemble learning

Combining multiple clusterings via k-modes algorithm

Clustering Ensemble with High Diversity Based on Adding Artificial Data

An Ensemble Hierarchical Clustering Algorithm Based on Merits at Cluster and Partition Levels

Knowledge Based Cluster Ensemble for Cancer Discovery from Biomolecular Data

Clustering ensemble algorithm with high-order consistency learning

A novel member enhancement‐based clustering ensemble algorithm

Enhancing Ensemble Clustering with Adaptive High-Order Topological Weights

Subspace Clustering by Directly Solving Discriminative K-means

Ensemble Clustering Based on Meta-Learning and Hyperparameter Optimization

An adaptive highly improving the accuracy of clustering algorithm based on kernel density estimation

k-HyperEdge Medoids for Clustering Ensemble

A Multi-disciplinary Ensemble Algorithm for Clustering Heterogeneous Datasets

An Improved Affinity Propagation Clustering Algorithm Based on Entropy Weight Method and Principal Component Analysis

A Point-Cluster-Partition Architecture for Weighted Clustering Ensemble

Developing ensemble clustering through similarity measures: A semi‐supervised hierarchical clustering learning

Clusterer ensemble

The Impact of Isolation Kernel on Agglomerative Hierarchical Clustering Algorithms

HCDC: A novel hierarchical clustering algorithm based on density-distance cores for data sets with varying density

PCS-granularity weighted ensemble clustering via Co-association matrix

K*-Means: An Efficient Clustering Algorithm with Adaptive Decision Boundaries