Promoting High Diversity Ensemble Learning with EnsembleBench

Yanzhao Wu, Ling Liu, Zhongwei Xie, Juhyun Bae, Ka-Ho Chow, Wenqi Wei
DOI: https://doi.org/10.48550/arXiv.2010.10623
2020-10-21
Abstract: Ensemble learning is gaining renewed interest in recent years. This paper presents EnsembleBench, a holistic framework for evaluating and recommending high diversity and high accuracy ensembles. The design of EnsembleBench offers three novel features: (1) EnsembleBench introduces a set of quantitative metrics for assessing the quality of ensembles and for comparing alternative ensembles constructed for the same learning tasks. (2) EnsembleBench implements a suite of baseline diversity metrics and optimized diversity metrics for identifying and selecting ensembles with high diversity and high quality, making it an effective framework for benchmarking, evaluating and recommending high diversity model ensembles. (3) Four representative ensemble consensus methods are provided in the first release of EnsembleBench, enabling empirical study on the impact of consensus methods on ensemble accuracy. A comprehensive experimental evaluation on popular benchmark datasets demonstrates the utility and effectiveness of EnsembleBench for promoting high diversity ensembles and boosting the overall performance of selected ensembles.
Machine Learning
What problem does this paper attempt to address?
The paper addresses how to evaluate and recommend ensembles that are both highly diverse and highly accurate. It focuses on four aspects:

1. **Quantitative evaluation of ensemble quality**: Existing research lacks clear criteria for what constitutes a good ensemble and how to select one. The paper proposes a set of quantitative metrics for assessing the quality of an ensemble and for comparing alternative ensembles built for the same learning task.
2. **Improving ensemble diversity**: Highly diverse ensembles have high failure independence and low negative correlation, which helps improve overall prediction performance. The paper introduces a suite of baseline and optimized diversity metrics for identifying and selecting highly diverse ensembles.
3. **Benchmarking and recommending highly diverse ensembles**: The paper provides a framework for benchmarking, evaluating, and recommending highly diverse ensembles. By implementing both the baseline and the optimized diversity metrics, the framework can identify high-quality ensembles.
4. **Impact of consensus methods**: The paper provides four representative ensemble consensus methods: model averaging, majority voting, plurality voting, and learning combination. The experiments use these methods to study how the choice of consensus method affects ensemble accuracy.

In summary, the paper proposes a comprehensive framework, EnsembleBench, that combines quantitative quality metrics, diversity metrics, and consensus methods to select ensembles with better overall performance.
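To make three of the consensus methods named above concrete (model averaging, majority voting, and plurality voting), here is a minimal NumPy sketch. It is an illustration under stated assumptions, not EnsembleBench's actual API: every function name is hypothetical, the input is assumed to be an array of per-model class probabilities with shape `(n_models, n_samples, n_classes)`, and returning `-1` when no class wins a strict majority is a choice made here for illustration. Learning combination is left out because it requires training a separate combiner. The last function is a generic pairwise disagreement rate, included only to show what a diversity score can look like; it is not necessarily one of the paper's baseline or optimized metrics.

```python
import numpy as np


def model_averaging(probs):
    """Average the members' probability vectors, then pick the top class."""
    return np.mean(probs, axis=0).argmax(axis=1)


def vote_counts(probs):
    """Return an (n_samples, n_classes) table of votes per class."""
    n_models, n_samples, n_classes = probs.shape
    labels = probs.argmax(axis=2)  # hard label per model and sample
    counts = np.zeros((n_samples, n_classes), dtype=int)
    for member_labels in labels:
        # Each sample row is indexed once per member, so += is safe here.
        counts[np.arange(n_samples), member_labels] += 1
    return counts


def plurality_voting(probs):
    """Pick the class with the most votes (ties go to the lowest class index)."""
    return vote_counts(probs).argmax(axis=1)


def majority_voting(probs, no_decision=-1):
    """Pick a class only if more than half of the members vote for it.

    Assumption for this sketch: samples without a strict majority
    receive the placeholder label `no_decision`.
    """
    counts = vote_counts(probs)
    winner = counts.argmax(axis=1)
    has_majority = counts.max(axis=1) * 2 > probs.shape[0]
    return np.where(has_majority, winner, no_decision)


def pairwise_disagreement(probs):
    """Mean fraction of samples on which two members predict different labels."""
    labels = probs.argmax(axis=2)
    n_models = labels.shape[0]
    rates = [
        np.mean(labels[i] != labels[j])
        for i in range(n_models)
        for j in range(i + 1, n_models)
    ]
    return float(np.mean(rates))


# Toy input: 3 member models, 2 samples, 3 classes (made-up numbers).
demo = np.array([
    [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1]],
    [[0.2, 0.7, 0.1], [0.1, 0.8, 0.1]],
    [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]],
])
print(model_averaging(demo))
print(plurality_voting(demo))
print(majority_voting(demo))
print(pairwise_disagreement(demo))
```

On the second toy sample the three members each vote for a different class, so the methods diverge: averaging follows the summed probabilities, plurality voting falls back to its tie-breaking rule, and majority voting abstains. Differences of this kind are what an empirical comparison of consensus methods would measure.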