Abstract: For classification and regression on tabular data, the dominance of gradient-boosted decision trees (GBDTs) has recently been challenged by often much slower deep learning methods with extensive hyperparameter tuning. We address this discrepancy by introducing (a) RealMLP, an improved multilayer perceptron (MLP), and (b) strong meta-tuned default parameters for GBDTs and RealMLP. We tune RealMLP and the default parameters on a meta-train benchmark with 118 datasets and compare them to hyperparameter-optimized versions on a disjoint meta-test benchmark with 90 datasets, as well as the GBDT-friendly benchmark by Grinsztajn et al. (2022). Our benchmark results on medium-to-large tabular datasets (1K to 500K samples) show that RealMLP offers a favorable time-accuracy tradeoff compared to other neural baselines and is competitive with GBDTs in terms of benchmark scores. Moreover, a combination of RealMLP and GBDTs with improved default parameters can achieve excellent results without hyperparameter tuning. Finally, we demonstrate that some of RealMLP's improvements can also considerably improve the performance of TabR with default parameters.
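
The meta-tuning idea in the abstract (choose one default configuration on a meta-train benchmark, then check it on disjoint meta-test datasets) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the error values are invented, the function name `pick_meta_default` is hypothetical, and the arithmetic mean is used as the aggregate, which may differ from the aggregation and tuning procedure the authors actually use.

```python
import numpy as np

def pick_meta_default(errors):
    """Return the index of the candidate configuration with the lowest
    mean error across meta-train datasets.

    errors[i, j] is the error of candidate configuration i on dataset j.
    """
    errors = np.asarray(errors, dtype=float)
    mean_error = errors.mean(axis=1)  # aggregate over datasets (assumed: arithmetic mean)
    return int(np.argmin(mean_error))

# Hypothetical errors: 3 candidate configurations x 3 meta-train datasets.
meta_train_errors = np.array([
    [0.20, 0.30, 0.25],
    [0.18, 0.28, 0.31],
    [0.22, 0.24, 0.20],
])

# Hypothetical errors of the same candidates on 2 disjoint meta-test datasets.
meta_test_errors = np.array([
    [0.21, 0.27],
    [0.19, 0.33],
    [0.20, 0.22],
])

best = pick_meta_default(meta_train_errors)
# The chosen default is then evaluated on datasets it was never tuned on.
meta_test_score = float(meta_test_errors[best].mean())
print(best, meta_test_score)
```

Keeping the two dataset collections disjoint is what makes the meta-test score an honest estimate of how the defaults behave on a new dataset.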

What problem does this paper attempt to address?

The main problem this paper addresses is that, for classification and regression on tabular data, deep learning methods can be accurate but usually require extensive hyperparameter tuning and are often much slower, which limits their practical use. Gradient-boosted decision trees (GBDTs) have long dominated these tasks, although deep learning methods have recently challenged that dominance. The paper makes two main contributions:

1. **Introduction of RealMLP**: An improved multilayer perceptron (MLP) that performs better on tabular data through a collection of techniques and better default parameter settings. These include, among others, preprocessing with robust scaling and smooth clipping, new numerical embedding variants, diagonal weight layers, new scheduling methods, and different initialization methods.
2. **Strong meta-tuned default parameters**: Provided for both RealMLP and GBDTs, these defaults are intended to perform well without being tuned on individual datasets. The authors tune them on a meta-train benchmark of 118 datasets and evaluate them on a disjoint meta-test benchmark of 90 datasets.

The goal of these improvements is to make neural-network-based methods less dependent on hyperparameter tuning without sacrificing much performance, which improves their practicality and efficiency. The paper also explores strategies for selecting and combining different models, and shows how the improved default parameters can yield a better time-accuracy tradeoff.
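
To make the preprocessing step mentioned above more concrete, here is a minimal sketch of robust scaling followed by smooth clipping for numerical features. It is an illustration under stated assumptions, not the authors' implementation: the use of the interquartile range as the scale, the fallback for constant columns, the clipping function x / sqrt(1 + (x/3)^2), and the function names are all choices made for this example.

```python
import numpy as np

def robust_scale(x, q_low=25.0, q_high=75.0, eps=1e-12):
    """Center each column by its median and divide by an inter-quantile range.

    Assumption: constant columns (range ~ 0) are left unscaled.
    """
    x = np.asarray(x, dtype=float)
    median = np.median(x, axis=0)
    spread = np.percentile(x, q_high, axis=0) - np.percentile(x, q_low, axis=0)
    scale = np.where(spread > eps, spread, 1.0)
    return (x - median) / scale

def smooth_clip(x, bound=3.0):
    """Smoothly squash values into (-bound, bound).

    Close to the identity near zero, saturating for large |x|.
    """
    x = np.asarray(x, dtype=float)
    return x / np.sqrt(1.0 + (x / bound) ** 2)

# Hypothetical feature column with one extreme outlier.
column = np.array([1.0, 2.0, 3.0, 4.0, 1000.0])
scaled = robust_scale(column)   # the outlier stays very large after scaling
clipped = smooth_clip(scaled)   # the outlier is pulled just below the bound
print(scaled)
print(clipped)
```

Because the median and quantiles ignore the magnitude of extreme values, a single outlier does not compress the other values toward zero, and the smooth clip then bounds what reaches the network while remaining differentiable everywhere.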

Better by Default: Strong Pre-Tuned MLPs and Boosted Trees on Tabular Data

Well-tuned Simple Nets Excel on Tabular Datasets

Team up GBDTs and DNNs: Advancing Efficient and Effective Tabular Prediction with Tree-hybrid MLPs

Meta-Learning to Improve Pre-Training

TabM: Advancing Tabular Deep Learning with Parameter-Efficient Ensembling

Benchmarking and Optimization of Gradient Boosting Decision Tree Algorithms

Pretrained deep models outperform GBDTs in Learning-To-Rank under label scarcity

SigOpt Mulch: An Intelligent System for AutoML of Gradient Boosted Trees

TabReD: Analyzing Pitfalls and Filling the Gaps in Tabular Deep Learning Benchmarks

A New Optimization Model for MLP Hyperparameter Tuning: Modeling and Resolution by Real-Coded Genetic Algorithm

When Do Neural Nets Outperform Boosted Trees on Tabular Data?

BBTv2: Towards a Gradient-Free Future with Large Language Models

Gradient Boosting Reinforcement Learning

DeepGBM

HyperTuning: Toward Adapting Large Language Models without Back-propagation

Generalized Population-Based Training for Hyperparameter Optimization in Reinforcement Learning

A Simple and Fast Baseline for Tuning Large XGBoost Models

Understanding Transfer Learning and Gradient-Based Meta-Learning Techniques

Tabular Data: Is Attention All You Need?

Gradient Boosting Trees and Large Language Models for Tabular Data Few-Shot Learning

Hyperparameter Tuning MLPs for Probabilistic Time Series Forecasting