Abstract: Ensemble learning is an important branch of machine learning that integrates multiple learners to obtain better performance than any single learner. It is widely accepted that the base learners in an ensemble should be both accurate and diverse to achieve good performance. Among the factors that affect ensemble performance, diversity and margin are considered the two key ones. Most existing studies analyze the impact of these two factors separately, and mainly focus on their effect on the classification or regression error of the ensemble model. AUC is an important criterion for evaluating the classification performance of learners. It is a pair-wise criterion that measures the probability that a positive sample receives a higher score than a negative one. However, few studies have examined the relationship between AUC and diversity or margin. In this paper, we propose two AUC decomposition theorems based on the Ambiguity Decomposition, one of the most important generalization error decomposition theories. Further, we discuss the relationship between generalization error, AUC, diversity and margin. According to our theoretical results, the commonly used margin maximization method not only reduces the empirical error but also reduces the diversity among the base classifiers, which leads to overfitting. Similar results hold in the case of AUC. Because of this reduction in diversity, methods such as margin maximization cannot achieve satisfactory generalization performance with respect to classification error or AUC. Based on these theoretical results, we propose two new weight optimization algorithms for combining the base classifiers, which target classification error and AUC, respectively. Existing weight optimization methods usually suffer from overfitting, and they usually add a diversity-related term to the objective to avoid it. However, because the definition of diversity is unclear and parameter tuning is difficult, such methods usually do not achieve satisfactory performance. Inspired by our theoretical results, both of our proposed algorithms make use of the margin of the ensemble model. Moreover, the objective functions are quadratic functions of the margin, so the learning procedure is guaranteed to converge. By introducing a trade-off parameter p, we optimize the margin to a proper level instead of maximizing it, and can therefore achieve an optimal balance between accuracy and diversity. Since the parameter p is highly related to the regularization parameter, in practice we can fix p and determine the regularization parameter by grid search, which makes the proposed algorithms highly applicable. We evaluated our algorithms on 35 open datasets. The experimental results confirm that our algorithms are not sensitive to the parameter p. Compared with other commonly used ensemble methods, the proposed algorithms achieve significantly better results in most cases. Both our theoretical and experimental results show that there is a strong connection between diversity and margin, and that exploiting the relationship between them can effectively improve the generalization ability of ensemble models.
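
The two building blocks named in the abstract can be illustrated with a minimal sketch. It assumes the classical squared-error form of the Ambiguity Decomposition (ensemble error = weighted average error of the base learners minus weighted ambiguity, for weights that sum to one) and the standard pair-wise definition of AUC. It does not reproduce the AUC decomposition theorems or the weight optimization algorithms of the paper; the function names and toy data are hypothetical.

```python
# Illustrative sketch only: classical Ambiguity Decomposition for squared error
# and pair-wise AUC. Function names and data are assumptions, not the paper's.

def ensemble_output(preds, weights):
    """Weighted combination of base learner outputs (weights sum to 1)."""
    n = len(preds[0])
    return [sum(w * p[j] for w, p in zip(weights, preds)) for j in range(n)]


def ambiguity_decomposition(preds, weights, y):
    """Return (ensemble_error, weighted_avg_error, weighted_ambiguity)."""
    n = len(y)
    ens = ensemble_output(preds, weights)
    # Squared error of the combined prediction.
    ensemble_error = sum((e - t) ** 2 for e, t in zip(ens, y)) / n
    # Weighted average of the base learners' squared errors.
    avg_error = sum(
        w * sum((p[j] - y[j]) ** 2 for j in range(n))
        for w, p in zip(weights, preds)
    ) / n
    # Weighted spread of the base learners around the ensemble (diversity term).
    ambiguity = sum(
        w * sum((p[j] - ens[j]) ** 2 for j in range(n))
        for w, p in zip(weights, preds)
    ) / n
    return ensemble_error, avg_error, ambiguity


def pairwise_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l != 1]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data: three base learners scoring four samples (labels 1 = positive).
preds = [[0.9, 0.2, 0.6, 0.4], [0.7, 0.4, 0.2, 0.1], [0.8, 0.1, 0.9, 0.3]]
weights = [0.5, 0.3, 0.2]
y = [1, 0, 1, 0]

err, avg, amb = ambiguity_decomposition(preds, weights, y)
print(err, avg - amb)  # the two values agree up to floating-point rounding
print(pairwise_auc(ensemble_output(preds, weights), y))
```

Because the ambiguity term is non-negative, the identity shows why diversity among base learners can lower the ensemble error for a fixed average base error, which is the trade-off the abstract refers to.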

Maximizing Diversity by Transformed Ensemble Learning

Optimising Ensemble Combination Based on Maximisation of Diversity

Learning to Diversify via Weighted Kernels for Classifier Ensemble

End-to-End Ensemble Learning by Exploiting the Correlation Between Individuals and Weights.

Diversity-Based Ensemble with Sample Weight Learning

Selective Ensemble Based on Transformation of Classifiers Used Spca

Efficient Diversity-Driven Ensemble for Deep Neural Networks

Weighted Classifier Ensemble Based on Quadratic Form.

Learning to Diversify for Ensembling of An Amount of Classifiers

Ensemble Learning through Diversity Management: Theory, Algorithms, and Applications

A New Rotation Forest Ensemble Algorithm

Developing parsimonious ensembles using ensemble diversity within a reinforcement learning framework

Neural Network Ensembles: Theory, Training, and the Importance of Explicit Diversity

Promoting High Diversity Ensemble Learning with EnsembleBench

Deep Neural Network Ensembles against Deception: Ensemble Diversity, Accuracy and Robustness

Diversity Learning: Introducing the Space-time Scheme to Ensemble Learning

Decomposition Theories of Generalization Error and AUC in Ensemble Learning with Application in Weight Optimization

Classifier Ensemble with Diversity: Effectiveness Analysis and Ensemble Optimization

A Unified Theory of Diversity in Ensemble Learning

Achieving More with Less: A Tensor-Optimization-Powered Ensemble Method

On Diversity and Accuracy of Homogeneous and Heterogeneous Ensembles