Decomposition Theories of Generalization Error and AUC in Ensemble Learning with Application in Weight Optimization
Zheng-Shen JIANG, Hong-Zhi LIU, Bin FU, Zhong-Hai WU
DOI: https://doi.org/10.11897/SP.J.1016.2019.00001
2019-01-01
Abstract: Ensemble learning is an important branch of machine learning that combines multiple learners to obtain better performance than any single learner. It is widely accepted that the base learners in an ensemble should be both accurate and diverse. Among the factors that affect ensemble performance, diversity and margin are considered two of the most important. Most existing studies analyze these two factors separately and focus mainly on their impact on the classification or regression error of the ensemble model. AUC is an important criterion for evaluating classification performance. It is a pairwise criterion that estimates the probability that a positive sample receives a higher score than a negative one. However, few studies have examined the relationship between AUC and diversity or margin. In this paper, we propose two AUC decomposition theorems based on the Ambiguity Decomposition, one of the most important generalization error decomposition theories. We further discuss the relationship among generalization error, AUC, diversity, and margin. According to our theoretical results, the commonly used margin maximization approach reduces not only the empirical error but also the diversity among the base classifiers, which leads to overfitting. Similar results hold for AUC. Because of this reduction in diversity, methods such as margin maximization cannot achieve satisfactory generalization performance with respect to classification error or AUC. Based on these theoretical results, we propose two new weight optimization algorithms for combining base classifiers, one targeting classification error and the other targeting AUC. Existing weight optimization methods usually suffer from overfitting and typically add a diversity-related term to counter it. However, because diversity lacks a clear definition and the associated parameters are difficult to tune, such methods usually do not achieve satisfactory performance. Inspired by our theoretical results, both proposed algorithms make use of the margin of the ensemble model. Moreover, the objective functions are quadratic functions of the margin, so the learning procedure is guaranteed to converge. By introducing a trade-off parameter p, we optimize the margin to a proper level instead of maximizing it, and thereby achieve a balance between accuracy and diversity. Since the parameter p is closely related to the regularization parameter, in practice p can be fixed and the regularization parameter determined by grid search, which makes the proposed algorithms easy to apply. We evaluated our algorithms on 35 open datasets. The experimental results confirm that our algorithms are not sensitive to the parameter p. Compared with other commonly used ensemble methods, the proposed algorithms achieve significantly better results in most cases. Both our theoretical and experimental results show that there is a strong connection between diversity and margin, and that exploiting this relationship can effectively improve the generalization ability of ensemble models.
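As a concrete illustration of two notions the abstract relies on, the following is a minimal sketch, not the paper's algorithms. It assumes the classical squared-error form of the Ambiguity Decomposition for a convex-weighted ensemble (ensemble error = weighted average error of the base learners minus weighted ambiguity) and the usual pairwise definition of AUC with ties counted as one half. The function names `pairwise_auc` and `ambiguity_decomposition` are hypothetical and chosen only for this example.

```python
import numpy as np


def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample scores higher; tied pairs count as 0.5."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # diff[i, j] = score of positive i minus score of negative j
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())


def ambiguity_decomposition(preds, weights, y):
    """Return (ensemble_error, weighted_error, ambiguity), each averaged
    over samples, for squared error and a convex combination of learners.

    preds   : array of shape (n_learners, n_samples)
    weights : non-negative array of shape (n_learners,) summing to 1
    y       : array of shape (n_samples,)
    """
    preds = np.asarray(preds, dtype=float)
    w = np.asarray(weights, dtype=float)
    y = np.asarray(y, dtype=float)
    f_ens = w @ preds                               # ensemble output per sample
    ens_err = np.mean((f_ens - y) ** 2)             # error of the ensemble
    weighted_err = np.mean(w @ (preds - y) ** 2)    # weighted base-learner error
    ambiguity = np.mean(w @ (preds - f_ens) ** 2)   # spread around the ensemble
    return ens_err, weighted_err, ambiguity
```

For example, `pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])` ranks three of the four positive-negative pairs correctly and returns 0.75. For any weights that are non-negative and sum to one, the three values returned by `ambiguity_decomposition` satisfy `ens_err == weighted_err - ambiguity` up to floating-point error, which is why a larger ambiguity (diversity) term lowers the ensemble error when the base-learner errors are held fixed.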