Large Language Model Enhanced Machine Learning Estimators for Classification

Yuhang Wu, Yingfei Wang, Chu Wang, Zeyu Zheng
2024-05-09
Abstract: Pre-trained large language models (LLM) have emerged as a powerful tool for simulating various scenarios and generating output given specific instructions and multimodal input. In this work, we analyze the specific use of LLM to enhance a classical supervised machine learning method for classification problems. We propose a few approaches to integrate LLM into a classical machine learning estimator to further enhance the prediction performance. We examine the performance of the proposed approaches through both standard supervised learning binary classification tasks, and a transfer learning task where the test data observe distribution changes compared to the training data. Numerical experiments using four publicly available datasets are conducted and suggest that using LLM to enhance classical machine learning estimators can provide significant improvement on prediction performance.
Machine Learning
What problem does this paper attempt to address?
This paper discusses how to use large language models (LLMs) to enhance the performance of traditional machine learning classifiers. It proposes several methods that combine LLMs with classical machine learning approaches to improve prediction accuracy:

1. Linear combination method: The predictions of the LLM and the machine learning (ML) model are combined through a weighted linear combination. When the ML model is uncertain, as it is for data near the decision boundary, the combination relies more on the LLM's predictions.
2. LLM predictions as additional information: The LLM's predictions are incorporated as contextual information during model calibration to enhance the performance of the classical ML model.
3. Transfer learning task: Labels generated by the LLM are used to augment the training data in transfer learning tasks where the distribution changes, improving the ML model's performance on the new distribution.

In the experimental section, the paper evaluates these methods on tasks such as relevance prediction, sentiment recognition, and hate speech detection, using four publicly available datasets. It reports that ML models combined with LLMs outperform either LLMs or ML models used alone in terms of predictive performance.

In summary, the paper addresses how to integrate LLMs effectively to enhance ML algorithms in classification tasks, particularly for boundary cases and under distributional changes, and how to use LLMs in transfer learning to adapt to new data distributions.
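The linear combination method described above can be illustrated with a minimal sketch. The weighting rule here is an assumption made for illustration, not the paper's actual formula: the LLM's weight grows linearly as the ML model's predicted probability approaches 0.5, and the names `llm_weight`, `combine`, and `max_llm_weight` are hypothetical.

```python
def llm_weight(p_ml, max_llm_weight=0.5):
    """Weight given to the LLM prediction (illustrative rule, not from the paper).

    The ML confidence is 0 at the decision boundary (p_ml = 0.5) and 1 when
    p_ml is 0 or 1; the LLM weight shrinks as that confidence grows.
    """
    confidence = abs(p_ml - 0.5) * 2.0
    return max_llm_weight * (1.0 - confidence)


def combine(p_ml, p_llm, max_llm_weight=0.5):
    """Blend ML and LLM positive-class probabilities for one sample."""
    w = llm_weight(p_ml, max_llm_weight)
    return (1.0 - w) * p_ml + w * p_llm


# Toy probabilities (made up for the example, not experimental results).
ml_probs = [0.95, 0.50, 0.10]   # ML model: confident, uncertain, confident
llm_probs = [0.20, 0.90, 0.80]  # LLM estimates for the same samples

combined = [combine(m, l) for m, l in zip(ml_probs, llm_probs)]
labels = [int(p >= 0.5) for p in combined]
```

In this toy example the LLM changes the outcome only for the second sample, where the ML model sits exactly on the boundary; the two confident ML predictions keep their labels even though the LLM disagrees with them.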