Abstract: A key element in solving real-life data science problems is selecting the types of models to use. Tree ensemble models (such as XGBoost) are usually recommended for classification and regression problems with tabular data. However, several deep learning models for tabular data have recently been proposed, claiming to outperform XGBoost for some use cases. This paper explores whether these deep models should be a recommended option for tabular data by rigorously comparing the new deep models to XGBoost on various datasets. In addition to systematically comparing their performance, we consider the tuning and computation they require. Our study shows that XGBoost outperforms these deep models across the datasets, including the datasets used in the papers that proposed the deep models. We also demonstrate that XGBoost requires much less tuning. On the positive side, we show that an ensemble of deep models and XGBoost performs better on these datasets than XGBoost alone.

What problem does this paper attempt to address?

The problem this paper addresses is whether the recently proposed deep-learning models for tabular data should be the recommended choice. Specifically, it examines two aspects:

1. **Model accuracy**: Are these new deep-learning models more accurate than established classical models (such as XGBoost), especially on datasets that do not appear in their original papers?
2. **Time cost of training and hyper-parameter search**: How long does it take to train these deep-learning models and tune their hyper-parameters compared with XGBoost?

### Background

Traditionally, gradient-boosted decision tree (GBDT) models such as XGBoost have been widely recommended for their strong performance on tabular data. In recent years, however, several studies have proposed deep-learning models for tabular data and claimed that they can outperform XGBoost in some cases. Because there are no standard benchmark datasets, these models are difficult to compare, and the degree of model optimization varies between studies, so the conclusions remain unclear.

### Research purpose

The main purpose of this study is to evaluate whether these deep-learning models should be the recommended choice for tabular data problems. It does so by systematically comparing the newly proposed deep-learning models with XGBoost on multiple datasets, considering both their performance and the hyper-parameter tuning and computing resources they require.

### Methods

1. **Dataset selection**: The study used 11 different tabular datasets, 9 of which were from previous studies and 2 from Kaggle competitions.
2. **Experimental setup**: All models were trained and evaluated using the same hyper-parameter tuning protocol. The researchers used HyperOpt, a Bayesian optimization library, to optimize the hyper-parameters of each model.
3. **Performance evaluation**: For binary classification problems, cross-entropy loss was used; for regression problems, the root-mean-square error (RMSE) was used. Each configuration was run four times, and the average performance and standard error on the test set were reported.

### Main findings

1. **Model generalization ability**:
   - Deep-learning models generally perform worse than XGBoost on unseen datasets. XGBoost outperforms deep-learning models on 8 out of 11 datasets, and the difference is significant (p < 0.005).
   - Each deep-learning model performs best on the dataset used in its original paper, but its performance drops significantly on other datasets.
2. **Model ensembling**:
   - An ensemble of deep-learning models and XGBoost performs best on most datasets. On 7 out of 11 datasets, this ensemble significantly outperforms a single deep-learning model (p < 0.005).
   - Ensembles of deep-learning models alone, or of classical models alone, perform worse.
3. **Optimization difficulty**:
   - The hyper-parameter search for XGBoost is much shorter than for deep-learning models.
   - Deep-learning models require more computing resources for training and hyper-parameter tuning.

### Conclusion

Although deep-learning models perform well on some specific datasets, XGBoost overall remains the recommended choice for tabular data problems. In addition, combining XGBoost with deep-learning models in an ensemble can further improve performance. However, deep-learning models require more computing resources for training and hyper-parameter tuning, which may be a limiting factor in practical applications. The study therefore concludes that deep learning is not, at present, the only choice for tabular data problems.
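
The evaluation protocol described under Methods (cross-entropy for binary classification, RMSE for regression, mean and standard error over repeated runs) and the idea of ensembling model predictions can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, and the uniform averaging in `average_ensemble` is an assumption, since the summary does not say how the ensemble members were weighted.

```python
import math
from statistics import mean, stdev


def cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy (log loss) for labels in {0, 1}."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error for regression targets."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def mean_and_standard_error(scores):
    """Mean and standard error of the mean over repeated runs."""
    return mean(scores), stdev(scores) / math.sqrt(len(scores))


def average_ensemble(predictions_per_model):
    """Uniform average of per-model predictions (assumed weighting).

    `predictions_per_model` holds one list of predictions per model,
    all aligned on the same test samples.
    """
    return [mean(sample) for sample in zip(*predictions_per_model)]


if __name__ == "__main__":
    # Hypothetical test-set losses from four runs of one configuration.
    run_losses = [0.50, 0.52, 0.48, 0.50]
    avg, se = mean_and_standard_error(run_losses)
    print(f"loss = {avg:.4f} +/- {se:.4f}")

    # Hypothetical predicted probabilities from two models.
    tree_model = [0.9, 0.2, 0.7]
    deep_model = [0.7, 0.4, 0.9]
    labels = [1, 0, 1]
    combined = average_ensemble([tree_model, deep_model])
    print(f"ensemble cross-entropy = {cross_entropy(labels, combined):.4f}")
```

Reporting the standard error alongside the mean makes it easier to judge whether a gap between two models exceeds run-to-run noise, which matters when each configuration is run only four times.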

Tabular Data: Deep Learning is Not All You Need

A Comprehensive Benchmark of Machine and Deep Learning Across Diverse Tabular Datasets

Tabular Data: Is Attention All You Need?

When Do Neural Nets Outperform Boosted Trees on Tabular Data?

A Closer Look at Deep Learning Methods on Tabular Datasets

Can a Deep Learning Model Be a Sure Bet for Tabular Prediction?

Revisiting Deep Learning Models for Tabular Data

ExcelFormer: A neural network surpassing GBDTs on tabular data

TabReD: Analyzing Pitfalls and Filling the Gaps in Tabular Deep Learning Benchmarks

Transfer Learning with Deep Tabular Models

Squeezing Lemons with Hammers: An Evaluation of AutoML and Tabular Deep Learning for Data-Scarce Classification Applications

A Survey on Deep Tabular Learning

TabR: Tabular Deep Learning Meets Nearest Neighbors in 2023

Tabular deep learning: a comparative study applied to multi-task genome-wide prediction

A Federated Learning Benchmark on Tabular Data: Comparing Tree-Based Models and Neural Networks

Gradient Boosting Decision Trees on Medical Diagnosis over Tabular Data

When are Deep Networks really better than Decision Forests at small sample sizes, and how?

HyperTab: Hypernetwork Approach for Deep Learning on Small Tabular Datasets

XBNet : An Extremely Boosted Neural Network

Pretrained deep models outperform GBDTs in Learning-To-Rank under label scarcity

Well-tuned Simple Nets Excel on Tabular Datasets