An Algorithmic Framework for Constructing Multiple Decision Trees by Evaluating Their Combination Performance Throughout the Construction Process

Keito Tajima, Naoki Ichijo, Yuta Nakahara, Toshiyasu Matsushima
2024-02-10
Abstract: Predictions using a combination of decision trees are known to be effective in machine learning. The typical approaches to constructing such a combination are bagging and boosting. Bagging constructs decision trees independently, without evaluating their combined performance, and averages them afterward. Boosting constructs decision trees sequentially, evaluating at each step only the combined performance of the new decision tree and the fixed, previously constructed trees. Therefore, neither method directly constructs or evaluates the combination of decision trees used for the final prediction. When the final prediction is based on a combination of decision trees, it is natural to evaluate the appropriateness of that combination while constructing the trees. In this study, we propose a new algorithmic framework that constructs decision trees simultaneously and evaluates their combined performance throughout the construction process. Our framework repeats two procedures. In the first procedure, we construct new candidate combinations of decision trees in order to find a proper combination. In the second procedure, we evaluate the performance of each candidate combination under some criterion and select a better combination. To confirm the performance of the proposed framework, we perform experiments on synthetic and benchmark data.
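The contrast the abstract draws between the three construction strategies can be written schematically as follows. The notation is illustrative and assumed here, not taken from the paper: D is the training data, D_k a bootstrap resample, \ell a loss, T_k the k-th of K trees, and g the rule that combines trees into one predictor. Real tree learners search greedily rather than solving these minimizations exactly.

```latex
% Illustrative notation only (assumed, not the paper's own formulation).
% Requires amsmath.
\begin{align*}
\text{Bagging:}\quad
  & T_k = \operatorname*{arg\,min}_{T} \ell(T;\, D_k)
    \ \text{ independently for each } k,
    \qquad \hat f = \tfrac{1}{K} \textstyle\sum_{k=1}^{K} T_k \\
\text{Boosting:}\quad
  & T_k = \operatorname*{arg\,min}_{T} \ell\bigl(\hat f_{k-1} + T;\, D\bigr)
    \ \text{ with } \hat f_{k-1} = \textstyle\sum_{j<k} T_j \ \text{held fixed} \\
\text{Joint construction:}\quad
  & (T_1, \dots, T_K) \approx \operatorname*{arg\,min}_{T_1, \dots, T_K}
    \ell\bigl(g(T_1, \dots, T_K);\, D\bigr)
\end{align*}
```

Only the third line scores the same combined predictor that is used for the final prediction, which is the gap the abstract points to.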
Machine Learning
What problem does this paper attempt to address?
This paper proposes a new algorithmic framework for constructing decision tree ensembles that evaluates the combined performance of multiple decision trees throughout the construction process. Traditional ensemble methods such as bagging and boosting do not directly evaluate the performance of the final ensemble while the trees are being constructed. The framework repeats two steps: generating new candidate ensembles of decision trees, and evaluating each candidate under a chosen criterion to select a better one. This allows the trees to be constructed simultaneously while their combined predictive performance is evaluated continuously. In bagging, decision trees are constructed independently and in parallel, and the final prediction is obtained by averaging the individual predictions. In boosting, decision trees are built sequentially, and each newly added tree is evaluated in combination with the previously built trees, which are held fixed. Neither method directly assesses the combination of trees that is used for the final prediction. The paper evaluates the framework in experiments on synthetic and benchmark data and compares it with random forest (RF), reporting advantages for the proposed approach. In summary, the paper addresses the problem of evaluating and exploiting the combination performance of multiple decision trees during construction, with the aim of improving prediction accuracy and efficiency.
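The two-step loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's algorithm: the evaluation criterion is assumed to be mean squared error of the averaged prediction on the training data, candidates are assumed to be generated by adding one random split to one tree, and selection is assumed to be greedy (keep the single best candidate per round). All function names (`candidates`, `build`, and so on) are hypothetical.

```python
import random

def avg(values):
    values = list(values)
    return sum(values) / len(values)

def leaf(indices, y):
    """A leaf keeps its training indices and predicts their mean target."""
    return {"idx": indices, "value": avg(y[i] for i in indices)}

def leaves(node, path=()):
    """Yield (path, leaf) for every leaf of a tree."""
    if "feat" not in node:
        yield path, node
    else:
        yield from leaves(node["left"], path + ("left",))
        yield from leaves(node["right"], path + ("right",))

def replace(node, path, subtree):
    """Return a copy of the tree with the node at `path` replaced."""
    if not path:
        return subtree
    new = dict(node)
    new[path[0]] = replace(node[path[0]], path[1:], subtree)
    return new

def predict_tree(node, x):
    while "feat" in node:
        node = node["left"] if x[node["feat"]] <= node["thr"] else node["right"]
    return node["value"]

def predict_ensemble(trees, x):
    # Assumed combination rule: plain average of the trees.
    return avg(predict_tree(t, x) for t in trees)

def mse(trees, X, y):
    # Assumed criterion: training error of the *combined* prediction.
    return avg((predict_ensemble(trees, x) - t) ** 2 for x, t in zip(X, y))

def candidates(trees, X, y, rng, per_leaf=3):
    """Step 1: propose ensembles that differ from `trees` by one new split."""
    for k, tree in enumerate(trees):
        for path, lf in leaves(tree):
            for _ in range(per_leaf):
                feat = rng.randrange(len(X[0]))
                thr = X[rng.choice(lf["idx"])][feat]
                lo = [i for i in lf["idx"] if X[i][feat] <= thr]
                hi = [i for i in lf["idx"] if X[i][feat] > thr]
                if lo and hi:  # skip splits that leave one side empty
                    sub = {"feat": feat, "thr": thr,
                           "left": leaf(lo, y), "right": leaf(hi, y)}
                    yield trees[:k] + [replace(tree, path, sub)] + trees[k + 1:]

def build(X, y, n_trees=3, n_rounds=8, seed=0):
    """Repeat: generate candidate ensembles, then keep the best one."""
    rng = random.Random(seed)
    trees = [leaf(list(range(len(X))), y)] * n_trees
    for _ in range(n_rounds):
        # Step 2: score every candidate ensemble as a whole.
        best = min(candidates(trees, X, y, rng),
                   key=lambda c: mse(c, X, y), default=None)
        if best is None or mse(best, X, y) >= mse(trees, X, y):
            break  # no candidate improves the combination
        trees = best
    return trees

# Synthetic demo data (made up for illustration).
data_rng = random.Random(1)
X = [(data_rng.random(), data_rng.random()) for _ in range(200)]
y = [float(a > 0.5) + 0.5 * float(b > 0.3) for a, b in X]
baseline = [leaf(list(range(len(X))), y)] * 3
trees = build(X, y)
print(mse(baseline, X, y), mse(trees, X, y))
```

The point of the sketch is that every split, in whichever tree it lands, is accepted or rejected according to the error of the whole ensemble, rather than the error of the tree it belongs to. A criterion other than training error (for example a held-out score) could be swapped into `mse` without changing the loop.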