Abstract: Out-Of-Distribution (OOD) generalization is an essential topic in machine learning. However, recent research focuses only on methods for neural networks. This paper introduces a novel and effective solution for OOD generalization of decision tree models, named Invariant Decision Tree (IDT). During the growth of the tree, IDT applies a penalty term to the unstable/varying behavior of a split across different environments. We also construct its ensemble version, the Invariant Random Forest (IRF). Our proposed method is motivated by a theoretical result under mild conditions and validated by numerical tests on both synthetic and real datasets. The superior performance compared to non-OOD tree models implies that considering OOD generalization for tree models is necessary and should be given more attention.

What problem does this paper attempt to address?

The paper primarily addresses a key issue in machine learning: **Out-Of-Distribution (OOD) Generalization**. In many machine learning applications (such as image classification and speech recognition), models achieve good performance when the training data and test data come from the same distribution. In practice, however, the distribution of the test data often differs from that of the training data; this is known as the OOD generalization problem. For example, an autonomous driving system might encounter unseen signs, and a stock market predictor might face a large-scale crash triggered by new causes. The performance of existing machine learning models can degrade significantly in such OOD scenarios.

Various methods have been proposed to improve OOD generalization, but they mainly target deep neural networks (DNNs). For other types of machine learning models, such as decision trees, corresponding solutions are lacking. The paper therefore proposes a novel method called **Invariant Decision Tree (IDT)** and its ensemble version, **Invariant Random Forest (IRF)**, to address the OOD generalization issue in decision tree models. Specifically, IDT introduces a penalty term during tree growth that encourages splitting on stable features and discourages splits that behave inconsistently across different environments. The penalty term is derived from theoretical results, and its effectiveness is validated through numerical experiments.

The paper also details the theoretical basis of the method, including the concepts of stable and unstable splits, and uses a simple binary classification example to show how the two types of splits can be distinguished. It then proposes an invariant for judging the stability of a splitting variable and designs an additional penalty term based on this invariant to augment the traditional splitting criterion.

Finally, through experiments on synthetic and real datasets, the paper demonstrates that the proposed IRF method has significant advantages in OOD generalization over traditional tree models. The experimental results show that as the penalty weight λ increases, the model tends to split on more stable features, which improves its generalization ability in OOD scenarios.
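The idea of adding an environment-based penalty to a splitting criterion can be illustrated with a minimal sketch. It is not the paper's actual invariant or penalty: as an assumption for illustration, the penalty here is the mean squared difference, over pairs of environments, of the positive-class rate inside each child node. The function names (`gini`, `penalized_split_score`), the toy data, and the value of `lam` (standing in for λ) are all hypothetical.

```python
from itertools import combinations


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def penalized_split_score(xs, ys, envs, threshold, lam):
    """Weighted child impurity plus lam * instability penalty (lower is better).

    Illustrative assumption: a split counts as unstable when the positive-class
    rate inside a child node differs between environments.
    """
    n = len(xs)
    left = [i for i in range(n) if xs[i] <= threshold]
    right = [i for i in range(n) if xs[i] > threshold]

    # Standard part: size-weighted Gini impurity of the two children.
    impurity = sum(
        len(side) / n * gini([ys[i] for i in side]) for side in (left, right)
    )

    # Penalty part: disagreement of per-environment class rates in each child.
    penalty = 0.0
    for side in (left, right):
        rates = []
        for e in sorted(set(envs)):
            ys_e = [ys[i] for i in side if envs[i] == e]
            if ys_e:
                rates.append(sum(ys_e) / len(ys_e))
        pairs = list(combinations(rates, 2))
        if pairs:
            penalty += sum((a - b) ** 2 for a, b in pairs) / len(pairs)

    return impurity + lam * penalty


# Toy data: two environments with 8 samples each.
ys = [1, 1, 1, 1, 0, 0, 0, 0] * 2
envs = [0] * 8 + [1] * 8
# Stable feature: agrees with the label 75% of the time in both environments.
stable = [1, 1, 1, 0, 0, 0, 0, 1] * 2
# Spurious feature: perfect in environment 0, only 75% accurate in environment 1.
spurious = [1, 1, 1, 1, 0, 0, 0, 0] + [1, 1, 1, 0, 0, 0, 0, 1]

for lam in (0.0, 2.0):
    s = penalized_split_score(stable, ys, envs, threshold=0.5, lam=lam)
    u = penalized_split_score(spurious, ys, envs, threshold=0.5, lam=lam)
    print(f"lam={lam}: stable={s:.5f}, spurious={u:.5f}")
```

On this toy data the spurious feature has the lower (better) score when `lam` is 0, because it is more predictive on the pooled data. With `lam` set to 2 the stable feature wins, because its class rates are identical across environments and it incurs no penalty. This mirrors the reported trend that a larger λ shifts splits toward stable features.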

Invariant Random Forest: Tree-Based Model Solution for OOD Generalization

Towards a Theoretical Framework of Out-of-Distribution Generalization

Era Splitting -- Invariant Learning for Decision Trees

Decorr: Environment Partitioning for Invariant Learning and OOD Generalization

On the Connection Between Invariant Learning and Adversarial Training for Out-of-Distribution Generalization

On the Benefits of Over-parameterization for Out-of-Distribution Generalization

Time-Series Forecasting for Out-of-Distribution Generalization Using Invariant Learning

Out-of-Distribution Generalization Analysis via Influence Function

Invariant Graph Learning Meets Information Bottleneck for Out-of-Distribution Generalization

InvariantOODG: Learning Invariant Features of Point Clouds for Out-of-Distribution Generalization

Dissecting the Failure of Invariant Learning on Graphs

OoD-Bench: Quantifying and Understanding Two Dimensions of Out-of-Distribution Generalization

Invariant Representation via Decoupling Style and Spurious Features from Images

Towards Out-Of-Distribution Generalization: A Survey

A Survey on Evaluation of Out-of-Distribution Generalization

Out-of-Distribution Optimality of Invariant Risk Minimization

Winning Prize Comes from Losing Tickets: Improve Invariant Learning by Exploring Variant Parameters for Out-of-Distribution Generalization

Fast Decision Boundary based Out-of-Distribution Detector

Unifying Invariance and Spuriousity for Graph Out-of-Distribution via Probability of Necessity and Sufficiency

Model-Agnostic Random Weighting for Out-of-Distribution Generalization

Handling Distribution Shifts on Graphs: An Invariance Perspective