Invariant Random Forest: Tree-Based Model Solution for OOD Generalization

Yufan Liao, Qi Wu, Xing Yan
2024-01-18
Abstract: Out-Of-Distribution (OOD) generalization is an essential topic in machine learning. However, recent research has focused only on methods for neural networks. This paper introduces a novel and effective solution for OOD generalization of decision tree models, named Invariant Decision Tree (IDT). During the growth of the tree, IDT applies a penalty term to the unstable or varying behavior of a split across different environments. Its ensemble version, the Invariant Random Forest (IRF), is also constructed. The proposed method is motivated by a theoretical result under mild conditions and validated by numerical tests on both synthetic and real datasets. Its superior performance compared to non-OOD tree models implies that considering OOD generalization for tree models is absolutely necessary and should be given more attention.
Machine Learning
What problem does this paper attempt to address?
The paper addresses a key issue in machine learning: **Out-Of-Distribution (OOD) generalization**. In many applications, such as image classification and speech recognition, models perform well when the training and test data come from the same distribution. In practice, however, the test distribution often differs from the training distribution, and this mismatch is the OOD generalization problem. For example, an autonomous driving system may encounter signs it has never seen, and a stock market model may face a large-scale crash with a previously unobserved cause. The performance of existing machine learning models can degrade significantly in such scenarios. Various methods have been proposed to improve OOD generalization, but they focus mainly on deep neural networks (DNNs); according to the paper, other model types such as decision trees have no corresponding solutions.
The paper therefore proposes the **Invariant Decision Tree (IDT)** and its ensemble version, the **Invariant Random Forest (IRF)**, to address OOD generalization for decision tree models. During tree growth, IDT adds a penalty term that discourages splits whose behavior is inconsistent across environments, which steers the tree toward stable features. The penalty is motivated by a theoretical result, and its effectiveness is validated through numerical experiments. The paper also details the theoretical basis of the method, including the concepts of stable and unstable splits, and uses a simple binary classification example to show how the two can be distinguished.
The paper then proposes an invariant for judging the stability of a splitting variable and designs an additional penalty term based on it, which augments the traditional splitting criterion. Experiments on synthetic and real datasets show that IRF has significant advantages over traditional tree models in OOD generalization. The results also show that as the penalty weight λ increases, the model relies more on stable features for splitting, which improves its generalization in OOD scenarios.
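The effect of such a penalty can be illustrated with a minimal sketch. It is not the paper's formula: the summary does not state the exact invariant or penalty, so the `instability` function below is a hypothetical stand-in that measures how much the positive-class rate inside each child node differs between environments. The toy data, the feature names `s` (stable) and `u` (unstable), and the score `gain - lam * instability` are likewise assumptions made for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def children(rows, feature, threshold):
    """Labels falling into the left (<= threshold) and right child."""
    left = [r[-1] for r in rows if r[feature] <= threshold]
    right = [r[-1] for r in rows if r[feature] > threshold]
    return left, right

def pooled_gain(envs, feature, threshold):
    """Standard Gini gain computed on all environments pooled together."""
    rows = [r for env in envs for r in env]
    left, right = children(rows, feature, threshold)
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
    return gini([r[-1] for r in rows]) - weighted

def instability(envs, feature, threshold):
    """Hypothetical penalty (an assumption, not the paper's invariant):
    spread of the positive-class rate across environments, averaged
    over the two children."""
    per_env = [children(env, feature, threshold) for env in envs]
    spreads = []
    for side in (0, 1):
        rates = [sum(ch[side]) / len(ch[side]) for ch in per_env if ch[side]]
        spreads.append(max(rates) - min(rates) if rates else 0.0)
    return sum(spreads) / len(spreads)

def penalized_score(envs, feature, threshold, lam):
    """Pooled gain minus lam times the cross-environment instability."""
    gain = pooled_gain(envs, feature, threshold)
    return gain - lam * instability(envs, feature, threshold)

def make_env(n_copies, flip_u):
    """Toy rows (s, u, y): s matches y 75% of the time in every
    environment; u matches y perfectly unless flip_u reverses it."""
    base = [(0, 0, 0)] * 3 + [(0, 1, 1)] + [(1, 1, 1)] * 3 + [(1, 0, 0)]
    rows = base * n_copies
    if flip_u:
        rows = [(s, 1 - u, y) for s, u, y in rows]
    return rows

FEATURES = {"s": 0, "u": 1}

def best_feature(envs, lam, threshold=0.5):
    """Name of the feature with the highest penalized score."""
    return max(
        FEATURES,
        key=lambda name: penalized_score(envs, FEATURES[name], threshold, lam),
    )

# A large environment where u is predictive, and a small one where it reverses.
envs = [make_env(4, flip_u=False), make_env(1, flip_u=True)]
print(best_feature(envs, lam=0.0))  # u
print(best_feature(envs, lam=1.0))  # s
```

In this toy setting the unpenalized criterion prefers `u`, because the larger environment dominates the pooled data. With the penalty switched on, the choice moves to `s`, whose child-node class rates are identical in both environments. This mirrors the qualitative behavior the summary reports for increasing λ, without reproducing the paper's actual criterion.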