Explaining Random Forests As Single Decision Trees Through Distance Functional Optimization

Zihao Li, Xiao Du, Tieru Wu, Yang Cao
DOI: https://doi.org/10.1109/ijcnn60899.2024.10650261
2024-01-01
Abstract: Random Forests are renowned for their strong performance on tabular data and are widely used in real-world tasks. However, the ensemble mechanism sacrifices interpretability for higher generalization ability, which limits the use of Random Forests in some application domains. Against this background, this paper proposes a novel method that explains Random Forests as interpretable Decision Trees by optimizing the distance between the original models and their explanations, so that the explanations are generated under the supervision of the original models themselves, without loss of information. The resulting Decision Trees can be regarded as approximations of the original forests and are referred to as APtree explanations. To generate APtree explanations, we derive an explicit discrete scheme for the distance functional between Random Forests and Decision Trees by leveraging the properties of simple functions from real analysis. Based on this distance functional, we design an Explainability Gain Function to construct APtree explanations. Furthermore, we study the properties of the designed distance functional and present a proposition that reduces, in theory, the computational complexity of generating explanations. We evaluate the effectiveness of APtree explanations on several datasets and demonstrate that they can be used to explain Random Forests while achieving classification performance comparable to that of the original forests. Our code is available at https://github.com/Zihaolll/APtree.
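The idea of a discrete scheme for the distance can be illustrated with a toy example. The sketch below is not the paper's distance functional or its Explainability Gain Function; it assumes a one-dimensional input on [0, 1], a squared L2 distance, and a forest that averages its trees. The helper names (`step_fn`, `forest_fn`, `squared_l2_distance`) are hypothetical. Because trees and forests are piecewise constant (simple functions), the integral of the squared difference collapses to a finite sum over the cells induced by all split points.

```python
from bisect import bisect_right


def step_fn(thresholds, values):
    """Piecewise-constant function: values[i] applies on the i-th interval
    delimited by the sorted thresholds (a 1-D stand-in for a decision tree)."""
    assert len(values) == len(thresholds) + 1

    def f(x):
        return values[bisect_right(thresholds, x)]

    return f


def forest_fn(trees):
    """Toy forest: the average of its trees' outputs (an assumption of this sketch)."""

    def f(x):
        return sum(t(x) for t in trees) / len(trees)

    return f


def squared_l2_distance(f, g, breakpoints):
    """Exact integral of (f - g)^2 over [0, 1], assuming f and g are both
    constant between consecutive breakpoints."""
    edges = sorted({0.0, 1.0} | {b for b in breakpoints if 0.0 < b < 1.0})
    total = 0.0
    for lo, hi in zip(edges, edges[1:]):
        mid = (lo + hi) / 2  # any point in the cell gives the same values
        total += (f(mid) - g(mid)) ** 2 * (hi - lo)
    return total


# Toy forest of three stumps with different split points.
stumps = [
    step_fn([0.3], [0.0, 1.0]),
    step_fn([0.5], [0.0, 1.0]),
    step_fn([0.7], [0.0, 1.0]),
]
forest = forest_fn(stumps)
all_breaks = [0.3, 0.5, 0.7]

# Two candidate single-tree explanations: a stump and a four-leaf tree.
coarse_tree = step_fn([0.5], [0.0, 1.0])
fine_tree = step_fn([0.3, 0.5, 0.7], [0.0, 1 / 3, 2 / 3, 1.0])

coarse_dist = squared_l2_distance(forest, coarse_tree, all_breaks)
fine_dist = squared_l2_distance(forest, fine_tree, all_breaks)
print(coarse_dist, fine_dist)
```

In this toy setting the four-leaf tree reproduces the forest on every cell, so its distance is zero, while the stump differs from the forest on the two middle cells. A tree-construction procedure could compare candidate splits by how much they reduce such a distance, which is the general flavor of the approach the abstract describes.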