Abstract: Clustering ensemble is a popular approach for identifying data clusters that combines the results of multiple base clustering algorithms to produce more accurate and robust clusters. However, the performance of clustering ensemble algorithms depends heavily on the quality of the ensemble members. To address this problem, this paper proposes a member enhancement‐based clustering ensemble (MECE) algorithm that selects the ensemble members by considering their distribution consistency. MECE has two main components, called heterocluster splitting and homocluster merging. The first component estimates two probability density functions (p.d.f.s) from the sample points of a heterocluster, one represented by a Gaussian distribution and the other by a Gaussian mixture model. If the random numbers generated from these two p.d.f.s follow different probability distributions, the heterocluster is split into smaller clusters. The second component merges clusters that have high neighborhood densities into a homocluster, where the neighborhood density is measured using a novel evaluation criterion. In addition, a co‐association matrix is presented, which serves as a summary of the ensemble of diverse clusters. A series of experiments was conducted to evaluate the feasibility and effectiveness of the proposed ensemble member generation algorithm. The results show that MECE selects high‐quality ensemble members and, as a result, yields better clusterings than six state‐of‐the‐art ensemble clustering algorithms: the cluster‐based similarity partitioning algorithm (CSPA), the meta‐clustering algorithm (MCLA), hybrid bipartite graph formulation (HBGF), evidence accumulation clustering (EAC), locally weighted evidence accumulation (LWEA), and locally weighted graph partition (LWGP).
Specifically, MECE achieves nearly 23% higher average NMI, 27% higher average ARI, 15% higher average FMI, and 10% higher average purity than the CSPA, MCLA, HBGF, EAC, LWEA, and LWGP algorithms. The experimental results demonstrate that MECE is a valid approach to clustering ensemble problems.
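
The co‐association matrix mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give MECE's exact construction, so this assumes the standard evidence‐accumulation definition, in which entry (i, j) is the fraction of base clusterings that place samples i and j in the same cluster. The function name `co_association` and the toy label vectors are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def co_association(partitions: Sequence[Sequence[int]]) -> List[List[float]]:
    """Return the n x n co-association matrix of a set of base clusterings.

    Assumption: the standard evidence-accumulation definition, not
    necessarily the exact matrix used by MECE.
    """
    if not partitions:
        raise ValueError("need at least one base clustering")
    n = len(partitions[0])
    if any(len(labels) != n for labels in partitions):
        raise ValueError("all label vectors must have the same length")

    # Count how many base clusterings co-cluster each pair of samples.
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1

    # Normalize the counts by the number of base clusterings.
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


# Three hypothetical base clusterings of four samples.
# Cluster ids are arbitrary: only co-membership matters.
base_clusterings = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
C = co_association(base_clusterings)
```

In this toy ensemble, samples 0 and 1 are grouped together by every base clustering, while samples 0 and 3 never are. A consensus step (for example, hierarchical clustering on `1 - C`) would then typically be run on this matrix.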

Clusterer ensemble

Combining multiple clusterings via k-modes algorithm

A Multi-Task Learning Strategy for Unsupervised Clustering Via Explicitly Separating the Commonality

Clustering Ensemble with High Diversity Based on Adding Artificial Data

Bagging-Based Selective Clusterer Ensemble

Clustering ensemble algorithm with high-order consistency learning

Ensemble Clustering Based on Meta-Learning and Hyperparameter Optimization

k-HyperEdge Medoids for Clustering Ensemble

Probabilistic Cluster Structure Ensemble

A novel member enhancement‐based clustering ensemble algorithm

Multiple clustering and selecting algorithms with combining strategy for selective clustering ensemble

Sequential Combination Methods for Data Clustering Analysis

Self-Paced Clustering Ensemble

Clustering Ensemble Approaches: an Overview

Active Clustering Ensemble With Self-Paced Learning

Clustering Combination Method

Spectral Aggregation for Clustering Ensemble

Solving multi-instance problems with classifier ensemble based on constructive clustering

Clustering Ensemble Meets Low-rank Tensor Approximation

Knowledge Based Cluster Ensemble for Cancer Discovery from Biomolecular Data

Soft Cluster Ensemble Based on Fuzzy Similarity Measure