AutoML-GPT: Automatic Machine Learning with GPT

Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
2023-05-04
Abstract: AI tasks encompass a wide range of domains and fields. While numerous AI models have been designed for specific tasks and applications, they often require considerable human effort to find the right model architecture, optimization algorithm, and hyperparameters. Recent advances in large language models (LLMs) like ChatGPT show remarkable capabilities in various aspects of reasoning, comprehension, and interaction. Consequently, we propose developing task-oriented prompts and utilizing LLMs to automate the training pipeline. To implement this concept, we present AutoML-GPT, which employs GPT as the bridge to diverse AI models and dynamically trains models with optimized hyperparameters. AutoML-GPT dynamically takes user requests from the model and data cards and composes the corresponding prompt paragraph. Ultimately, with this prompt paragraph, AutoML-GPT will automatically conduct the experiments from data processing to model architecture, hyperparameter tuning, and predicted training log. By leveraging GPT's robust language capabilities and the available AI models, AutoML-GPT can tackle numerous intricate AI tasks across various datasets. This approach achieves remarkable results in computer vision, natural language processing, and other challenging areas. Extensive experiments and ablation studies demonstrate that our method can be general, effective, and beneficial for many AI tasks.
Computation and Language, Artificial Intelligence, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The paper addresses a key question in Automated Machine Learning (AutoML): how to use Large Language Models (LLMs) to automate the full machine learning pipeline, including data processing, model architecture design, hyperparameter tuning, and training log generation. To this end, the authors propose a framework called AutoML-GPT, which relies on the capabilities of LLMs such as GPT to carry out these steps.
In AutoML-GPT, users describe the dataset and the model requirements by providing Data Cards and Model Cards. This information is merged into a structured prompt paragraph, which the LLM then uses to perform data preprocessing, select an appropriate model architecture, adjust hyperparameters, and generate a predicted training log, reducing the need for human intervention.
The paper demonstrates AutoML-GPT in fields such as computer vision and natural language processing, and reports that it is effective on unseen datasets. For example, in object detection tasks, AutoML-GPT can recommend suitable hyperparameter configurations; in question-answering systems, it can adjust the model according to user requirements to achieve faster inference without significant loss of performance. The paper also evaluates AutoML-GPT on classification tasks to assess its generality and effectiveness across different datasets and task types.
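The step in which Data Cards and Model Cards are merged into a single prompt paragraph can be illustrated with a minimal sketch. This is not the authors' implementation: the class names `DataCard` and `ModelCard`, their fields, the function `compose_prompt`, and the prompt wording are all assumptions made for illustration, and the call to an actual LLM is left out.

```python
from dataclasses import dataclass, field


@dataclass
class DataCard:
    """Hypothetical data card describing a dataset (field names are assumptions)."""
    name: str
    input_type: str
    label_space: str
    evaluation_metric: str


@dataclass
class ModelCard:
    """Hypothetical model card describing a model and its starting hyperparameters."""
    name: str
    structure: str
    hyperparameters: dict = field(default_factory=dict)


def compose_prompt(data: DataCard, model: ModelCard, request: str) -> str:
    """Merge both cards and the user request into one prompt paragraph.

    The wording of the paragraph is illustrative only; the paper's actual
    prompt template is not reproduced here.
    """
    if model.hyperparameters:
        # Sort keys so the same cards always yield the same prompt text.
        hparams = ", ".join(
            f"{key}={value}" for key, value in sorted(model.hyperparameters.items())
        )
    else:
        hparams = "not specified"
    return (
        f"Dataset: {data.name}. Input type: {data.input_type}. "
        f"Label space: {data.label_space}. Metric: {data.evaluation_metric}. "
        f"Model: {model.name} ({model.structure}). "
        f"Initial hyperparameters: {hparams}. "
        f"Request: {request}"
    )


if __name__ == "__main__":
    # Placeholder values, not taken from the paper's experiments.
    data_card = DataCard("example-images", "image", "10 classes", "accuracy")
    model_card = ModelCard(
        "example-vit", "vision transformer", {"lr": 0.001, "batch_size": 64}
    )
    print(compose_prompt(
        data_card, model_card,
        "Suggest hyperparameters and predict the training log.",
    ))
```

In a full pipeline the returned string would be sent to the LLM, whose reply would contain the suggested hyperparameters and the predicted training log. Keeping the cards as typed records and the composition as a pure function makes the prompt reproducible for a given pair of cards.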