Clustering Ensemble with High Diversity Based on Adding Artificial Data

Hui-Lan LUO, Fan-Sheng KONG, Yi-Xiao LI
DOI: https://doi.org/10.3969/j.issn.1003-6059.2008.05.018
2008-01-01
Abstract: Ensemble diversity is considered a key factor in ensemble learning. There are many methods for constructing clustering collections, or ensembles, but few of them focus on producing high ensemble diversity. Two methods are proposed for generating clustering ensembles with high diversity: constructing clustering ensemble by adding noise (CEAN) and improved CEAN (ICEAN). Both obtain high ensemble diversity by adding artificial data. Compared with other commonly used methods for generating clustering ensembles, CEAN and ICEAN increase ensemble diversity and thus achieve better clustering integration results at the same average ensemble member accuracy.
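The general idea of building ensemble members on data augmented with artificial points can be sketched as follows. This is a minimal illustration and not the authors' CEAN or ICEAN procedure, since the abstract does not say how the artificial data is generated or how diversity is measured. Everything specific here is an assumption: artificial points are drawn uniformly inside the bounding box of the real data, the base clusterer is a plain k-means, and diversity is taken as the mean pairwise disagreement (1 minus the Rand index). All function names are hypothetical.

```python
import random
from itertools import combinations


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Recompute each center as the mean of its members; keep it if empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def noisy_member(data, k, n_artificial, rng):
    """Cluster real data plus artificial points, then keep only the real labels."""
    dims = list(zip(*data))
    lows = [min(d) for d in dims]
    highs = [max(d) for d in dims]
    # Assumption: artificial points are uniform noise in the data's bounding box.
    artificial = [
        tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
        for _ in range(n_artificial)
    ]
    labels = kmeans(data + artificial, k, rng)
    return labels[: len(data)]


def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def diversity(ensemble):
    """Assumed diversity measure: mean pairwise (1 - Rand index)."""
    pairs = list(combinations(ensemble, 2))
    return sum(1 - rand_index(a, b) for a, b in pairs) / len(pairs)


def build_ensemble(data, k, size, n_artificial, seed=0):
    """Generate `size` clusterings, each on the data plus fresh artificial points."""
    rng = random.Random(seed)
    return [noisy_member(data, k, n_artificial, rng) for _ in range(size)]


# Toy data: two loose groups of 2-D points.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.2, 0.4),
        (2.0, 2.1), (2.2, 1.9), (1.9, 2.3), (2.3, 2.2), (2.1, 2.0)]
ensemble = build_ensemble(data, k=2, size=5, n_artificial=6)
print("ensemble diversity:", diversity(ensemble))
```

Because each member sees a different set of artificial points, the cluster centers shift from run to run, which is one plausible way the added data could raise disagreement between members. A consensus step that combines the members into a final clustering is not shown.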