Ensemble clustering using factor graph

Dong Huang, Jianhuang Lai, Chang-Dong Wang
DOI: https://doi.org/10.1016/j.patcog.2015.08.015
IF: 8
2016-01-01
Pattern Recognition
Abstract: In this paper, we propose a new ensemble clustering approach termed ensemble clustering using factor graph (ECFG). Compared to existing approaches, our approach has three main advantages: (1) the cluster number is obtained automatically and need not be specified in advance; (2) the reliability of each base clustering can be estimated in an unsupervised manner and exploited in the consensus process; (3) our approach is efficient for processing ensembles with large data sizes and large ensemble sizes. We introduce the concept of the super-object, which serves as a compact and adaptive representation of the ensemble data and significantly facilitates the computation. Through a probabilistic formulation, we cast the ensemble clustering problem as a binary linear programming (BLP) problem. The BLP problem is NP-hard. To solve this optimization problem, we propose an efficient solver based on a factor graph. The constrained objective function is represented as a factor graph, and max-product belief propagation is used to generate a solution that is insensitive to initialization and converges to the neighborhood maximum. Extensive experiments on multiple real-world datasets demonstrate the effectiveness and efficiency of our approach against state-of-the-art approaches.
Highlights:
Introduce the super-object representation to facilitate the consensus process.
Probabilistically formulate the ensemble clustering problem as a BLP problem.
Propose an efficient solver for the BLP problem based on a factor graph.
The cluster number of the consensus clustering is estimated automatically.
Our method achieves state-of-the-art performance in effectiveness and efficiency.
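The abstract does not define how super-objects are built, so the following is only an illustrative sketch under a stated assumption: that a super-object groups the data objects that receive the same label in every base clustering, which makes the ensemble representation more compact. The function name `build_super_objects` and the toy label lists are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_super_objects(base_clusterings):
    """Group object indices that share one label in every base clustering.

    Assumption (not confirmed by the abstract): a super-object is the set of
    objects whose label tuple across all base clusterings is identical.
    """
    n_objects = len(base_clusterings[0])
    groups = defaultdict(list)
    for i in range(n_objects):
        # The signature is the object's label in each base clustering.
        signature = tuple(labels[i] for labels in base_clusterings)
        groups[signature].append(i)
    # Dicts preserve insertion order, so groups appear in order of first member.
    return list(groups.values())


# Hypothetical toy ensemble: two base clusterings of six objects.
base_clusterings = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
]
super_objects = build_super_objects(base_clusterings)
print(super_objects)
```

Under this assumption, the six objects collapse into four groups, and any later consensus step would operate on those groups rather than on individual objects.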