Abstract: Large Language Models (LLMs) have brought numerous new applications to Machine Learning (ML). In the context of tabular data (TD), recent studies show that TabLLM is a very powerful mechanism for few-shot learning (FSL) applications, even though gradient boosting decision trees (GBDT) have historically dominated the TD field. In this work we demonstrate that, although LLMs are a viable alternative, the evidence suggests that the baselines used to gauge performance can be improved. We replicated public benchmarks, and our methodology improves LightGBM by 290%; this is mainly driven by forcing node splitting with few samples, a critical step in FSL with GBDT. Our results show an advantage for TabLLM with 8 or fewer shots, but as the number of samples increases, GBDT provides competitive performance at a fraction of the runtime. For other real-life applications with a vast number of samples, we found FSL still useful for improving model diversity, and when combined with ExtraTrees it provides strong resilience to overfitting. Our proposal was validated in an ML competition setting, ranking first place.

What problem does this paper attempt to address?

### Problems the paper attempts to solve

This paper compares Gradient Boosting Decision Trees (GBDT) and Large Language Models (LLMs) on Tabular Data (TD) in the Few-Shot Learning (FSL) setting, and attempts to improve the GBDT baseline. Specifically, it addresses the following problems:

1. **Poor performance of GBDT in FSL**:
   - Previous studies have shown that LLMs perform well in few-shot learning, while GBDT performs poorly when very few samples are available. By adjusting key GBDT parameters (such as `min_data_in_leaf`), the paper significantly improves GBDT performance on FSL tasks.
2. **Establishing a fair baseline comparison**:
   - LLMs may have memorized certain datasets, which can produce excellent results on some tasks without reflecting true few-shot learning ability. The paper therefore emphasizes the need for a fair baseline. By optimizing the GBDT parameter configuration, it shows that a properly tuned GBDT can compete with LLMs and even outperform them in some cases.
3. **Exploring the performance of different models under different sample sizes**:
   - The paper analyzes how GBDT and LLM performance changes as the number of samples increases. With very few samples (such as 4 to 8), LLMs have an advantage; as the number of samples grows, GBDT provides competitive performance with a shorter running time.
4. **The value of FSL in practical applications**:
   - The paper also shows the value of FSL in practical applications, especially with large-scale data, where it can increase model diversity and improve robustness to overfitting. For example, in the FedCSIS 2024 Data Science Challenge, an FSL strategy was used to build multiple orthogonal models, and the resulting solution won first place.

### Summary

By optimizing the GBDT parameter configuration, this paper significantly improves GBDT performance in few-shot learning and enables a fairer comparison with LLMs. The results show that a properly tuned GBDT can perform well on few-shot learning tasks: as the number of samples increases, it delivers competitive, and in some cases better, performance with higher computational efficiency. The paper also emphasizes the importance and potential of FSL in practical applications.
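The role of `min_data_in_leaf` mentioned above can be illustrated with a small sketch. It is not the paper's code. It only counts how many split positions on a single feature would leave at least `min_data_in_leaf` rows in each child, assuming all feature values are distinct. The value 20 is LightGBM's documented default for this parameter; the helper `count_valid_splits` and the `few_shot_params` dictionary are hypothetical names introduced here, and the parameter values are illustrative assumptions rather than the settings used in the paper.

```python
# Illustrative sketch only; not taken from the paper.
LIGHTGBM_DEFAULT_MIN_DATA_IN_LEAF = 20  # LightGBM's documented default


def count_valid_splits(n_samples: int, min_data_in_leaf: int) -> int:
    """Count split positions on one sorted feature (distinct values assumed)
    that leave at least `min_data_in_leaf` rows in each child node."""
    if min_data_in_leaf < 1:
        raise ValueError("min_data_in_leaf must be >= 1")
    # A split after the k-th sorted row puts k rows left and n_samples - k right.
    return sum(
        1
        for k in range(1, n_samples)
        if k >= min_data_in_leaf and n_samples - k >= min_data_in_leaf
    )


# Hypothetical few-shot configuration: these values are assumptions chosen
# for illustration, not the paper's reported settings.
few_shot_params = {
    "objective": "binary",
    "min_data_in_leaf": 1,  # allow leaves that hold a single sample
    "verbose": -1,
}

if __name__ == "__main__":
    for shots in (4, 8, 16, 32, 64):
        default = count_valid_splits(shots, LIGHTGBM_DEFAULT_MIN_DATA_IN_LEAF)
        relaxed = count_valid_splits(shots, few_shot_params["min_data_in_leaf"])
        print(f"{shots:>3} shots: default={default:>2} valid splits, relaxed={relaxed:>2}")
```

Under these assumptions, a default-configured tree cannot split at all until at least 40 samples are available, which is consistent with the summary's point that forcing node splitting matters in the few-shot regime.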

Gradient Boosting Trees and Large Language Models for Tabular Data Few-Shot Learning
