Dual-granularity Weighted Ensemble Clustering

Li Xu, Shifei Ding
DOI: https://doi.org/10.1016/j.knosys.2021.107124
IF: 8.139
2021-01-01
Knowledge-Based Systems
Abstract: Ensemble clustering has been an active research topic in data mining in recent years. Selecting base clusterings that are both high in quality and diverse plays a key role in the quality of the final result. Traditional ensemble clustering selection algorithms usually treat each base clustering as a whole, ignoring the differences between clusters within the same clustering, which can reduce the validity of the final clustering result. To address this problem, and inspired by uncertainty measurement in rough set theory, a dual-granularity weighted ensemble clustering model is proposed. The main contributions of this paper are as follows: (1) the evaluation of cluster reliability is transformed into an uncertainty measurement problem in rough set theory; (2) at a finer-grained level, a local similarity measure between samples is designed; (3) a method is proposed for generating the elements of a weighted co-association matrix from global cluster reliability and local sample-pair similarity, after which a fusion function is applied to obtain the final clustering result. Experimental results show that the proposed method is insensitive to the size and diversity of the base clustering members and exhibits good robustness and stability. The final result obtained by this model is closer to the actual distribution of the data sets.
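The idea of a cluster-weighted co-association matrix can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not give the rough-set reliability measure or the local sample similarity, so the function name `weighted_co_association` and the per-cluster weights below are hypothetical placeholders supplied by the caller, and the local similarity term is omitted entirely.

```python
def weighted_co_association(base_labels, cluster_weights):
    """Build a co-association matrix in which each co-occurrence of two
    samples is scaled by the weight of the cluster they share.

    base_labels:     list of label lists, one per base clustering.
    cluster_weights: list of dicts (one per base clustering) mapping a
                     cluster label to a weight in [0, 1]. These weights are
                     hypothetical stand-ins for a cluster reliability score.
    """
    n = len(base_labels[0])   # number of samples
    m = len(base_labels)      # number of base clusterings
    co = [[0.0] * n for _ in range(n)]
    for labels, weights in zip(base_labels, cluster_weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    # Both samples fall in the same cluster: add that
                    # cluster's weight, averaged over the ensemble.
                    co[i][j] += weights[labels[i]] / m
    return co


# Two base clusterings of four samples (toy data, assumed for illustration).
base_labels = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
# Assumed weights: cluster 0 of the second clustering is treated as less reliable.
cluster_weights = [
    {0: 1.0, 1: 1.0},
    {0: 0.5, 1: 1.0},
]
co = weighted_co_association(base_labels, cluster_weights)
for row in co:
    print(row)
```

With all weights set to 1.0 this reduces to the ordinary co-association matrix. A consensus (fusion) step, for example hierarchical clustering on `1 - co[i][j]` as a distance, would then produce the final partition.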