AdaDT: an Adaptive Decision Tree for Addressing Local Class Imbalance Based on Multiple Split Criteria

Yan Jianjian, Zhang Zhongnan, Dong Huailin
DOI: https://doi.org/10.1007/s10489-020-02061-z
IF: 5.3
2021-01-01
Applied Intelligence
Abstract: A decision tree is a data-driven classification model whose core component is the split criterion. Although a great many split criteria have been proposed, almost all of them focus on the global class distribution of the training data. They ignore the local class imbalance problem that commonly arises during decision tree induction, even on balanced or roughly balanced binary-class data sets. This study investigates that problem in detail and proposes an adaptive approach based on multiple existing split criteria. In the proposed scheme, the local class imbalance ratio serves as a weight factor that balances the importance of these split criteria when determining the optimal splitting point at each internal node. To evaluate its effectiveness, the proposed method is applied to twenty roughly balanced real-world binary-class data sets. Experimental results show that the proposed method not only outperforms all other methods, but also improves the prediction accuracy of each class.
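The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which split criteria are combined or how the imbalance ratio is computed, so everything below is an assumption made for illustration: the two criteria (Gini gain and a Hellinger distance between branches), the weight `|pos - neg| / n`, the rescaling of both criteria to [0, 1], and all function names are hypothetical and are not taken from the paper.

```python
import math


def class_counts(labels):
    """Return (positives, negatives) for a list of 0/1 labels."""
    pos = sum(1 for y in labels if y == 1)
    return pos, len(labels) - pos


def imbalance_weight(labels):
    """Assumed local imbalance measure: 0.0 when balanced, 1.0 when pure."""
    pos, neg = class_counts(labels)
    n = pos + neg
    return abs(pos - neg) / n if n else 0.0


def gini(labels):
    """Binary Gini impurity, 2 * p * (1 - p)."""
    pos, neg = class_counts(labels)
    n = pos + neg
    if n == 0:
        return 0.0
    p = pos / n
    return 2.0 * p * (1.0 - p)


def gini_gain(parent, left, right):
    """Impurity reduction of a binary split (at most 0.5 for two classes)."""
    n = len(parent)
    return gini(parent) - len(left) / n * gini(left) - len(right) / n * gini(right)


def hellinger(left, right):
    """Hellinger distance between the class-conditional branch distributions."""
    lp, ln = class_counts(left)
    rp, rn = class_counts(right)
    pos, neg = lp + rp, ln + rn
    if pos == 0 or neg == 0:
        return 0.0
    d = (math.sqrt(lp / pos) - math.sqrt(ln / neg)) ** 2
    d += (math.sqrt(rp / pos) - math.sqrt(rn / neg)) ** 2
    return math.sqrt(d)  # lies in [0, sqrt(2)]


def best_split(values, labels):
    """Pick the threshold maximizing the imbalance-weighted blend of criteria."""
    w = imbalance_weight(labels)  # weight is recomputed at every node
    cuts = sorted(set(values))
    best_threshold, best_score = None, float("-inf")
    for a, b in zip(cuts, cuts[1:]):
        t = (a + b) / 2.0
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        # Assumption: rescale both criteria to [0, 1] before blending.
        balanced_term = 2.0 * gini_gain(labels, left, right)
        skewed_term = hellinger(left, right) / math.sqrt(2.0)
        score = (1.0 - w) * balanced_term + w * skewed_term
        if score > best_score:
            best_threshold, best_score = t, score
    return best_threshold, best_score
```

For example, `best_split([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])` selects the threshold 3.5, which separates the two classes perfectly. The design point is that the weight is derived from the labels reaching the current node rather than from the whole training set, so a node deep in the tree can lean on the skew-oriented criterion even when the root is balanced.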