Leaf Node-Level Ensemble Pruning Approaches Based On Node-Sample Correlation For Random Forest

Xin Liu, Qifeng Zhou, Fan Yang
DOI: https://doi.org/10.1109/IECON.2017.8217016
2017-01-01
Abstract: As a state-of-the-art ensemble method, random forest predicts and generalizes well on a variety of datasets, but it is often composed of a large number of trees. Redundancy in the ensemble and implicit decision rules result in high operational costs and make the model difficult to comprehend. In this paper, novel leaf node-level pruning methods for random forest are proposed. Each leaf node extracted from a random forest model is regarded as a single classifier or classification rule and is then evaluated for pruning. Unlike traditional tree-level pruning and rule pruning methods, the idea is to evaluate and extract rules according to node-sample correlation, rather than to eliminate trees from the ensemble or to integrate the rules themselves. Experimental results show that the proposed methods can efficiently reduce the size of the rule set, and that the resulting rule-based ensemble achieves better interpretability without significant loss of accuracy.
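The general idea of leaf node-level pruning can be illustrated with a minimal sketch. The abstract does not define the node-sample correlation measure, so the `score` function below is a hypothetical stand-in (covered samples classified correctly minus covered samples misclassified, divided by the dataset size), and `Rule`, `covers`, `prune`, and `predict` are illustrative names rather than the authors' implementation. The rules are written by hand here; no forest is actually trained.

```python
from collections import Counter, namedtuple

# A leaf rule: the conjunction of tests on the root-to-leaf path, plus the
# class label predicted at that leaf. Each condition is (feature, op, threshold).
Rule = namedtuple("Rule", ["conditions", "label"])


def covers(rule, x):
    """Return True if sample x satisfies every test on the rule's path."""
    return all(
        (x[f] <= t) if op == "<=" else (x[f] > t)
        for f, op, t in rule.conditions
    )


def score(rule, X, y):
    """Hypothetical stand-in for node-sample correlation (an assumption):
    +1 for each covered sample the rule labels correctly, -1 for each
    covered sample it labels incorrectly, normalised by dataset size."""
    total = 0
    for x, label in zip(X, y):
        if covers(rule, x):
            total += 1 if rule.label == label else -1
    return total / len(X)


def prune(rules, X, y, keep):
    """Rank leaf rules by score and keep the `keep` best ones."""
    ranked = sorted(rules, key=lambda r: score(r, X, y), reverse=True)
    return ranked[:keep]


def predict(rules, x, default):
    """Majority vote among the kept rules that cover x."""
    votes = Counter(r.label for r in rules if covers(r, x))
    return votes.most_common(1)[0][0] if votes else default


# Toy data: one feature, two classes split around 2.5.
X = [(1.0,), (2.0,), (3.0,), (4.0,)]
y = [0, 0, 1, 1]

# Hand-written leaf rules, as if pooled from several trees of a forest.
rules = [
    Rule(((0, "<=", 2.5),), 0),
    Rule(((0, ">", 2.5),), 1),
    Rule(((0, "<=", 3.5),), 0),
    Rule(((0, ">", 1.5),), 1),
    Rule(((0, ">", 3.5),), 0),
]

kept = prune(rules, X, y, keep=2)
predictions = [predict(kept, x, default=0) for x in X]
```

In a real setting the rules would be extracted from the root-to-leaf paths of each trained tree, and the fixed `keep` threshold would be replaced by whatever selection criterion the paper actually uses.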