Large Language Models Are Zero-Shot Text Classifiers

Zhiqiang Wang, Yiran Pang, Yanbin Lin

2023-12-02

Abstract: Pretrained large language models (LLMs) are now used extensively across the sub-disciplines of natural language processing (NLP). Within NLP, text classification has received considerable attention but still faces limitations: high computational cost, long training time, and weak performance on unseen classes. With chain-of-thought (CoT) prompting, LLMs can be applied in a zero-shot learning (ZSL) setting using step-by-step reasoning prompts instead of conventional question-and-answer formats. Zero-shot LLMs can alleviate these limitations by using pretrained models directly to predict both seen and unseen classes. Our research primarily validates the capability of GPT models in text classification. We focus on applying prompt strategies effectively to various text classification scenarios. We also compare the performance of zero-shot LLMs with other state-of-the-art text classification methods, including traditional machine learning methods, deep learning methods, and ZSL methods. Experimental results show that LLMs are effective zero-shot text classifiers on three of the four datasets analyzed. This proficiency is especially advantageous for small businesses or teams that may not have extensive knowledge of text classification.

Computation and Language

What problem does this paper attempt to address?

The paper primarily aims to address several key issues in text classification, particularly in the application of Zero-Shot Learning (ZSL). Specifically:

1. **Reducing computational cost and time consumption**: Traditional machine learning and deep learning methods require a large amount of labeled data for training when handling text classification tasks, which is not only time-consuming but also computationally expensive. The paper proposes using pre-trained large language models (LLMs), such as the GPT model, to directly perform classification predictions without the need for additional labeled data.
2. **Improving robustness to unseen categories**: Traditional supervised learning methods can only classify known categories and cannot handle new, unseen categories. By introducing zero-shot learning techniques, pre-trained models can be used to predict both known and unknown categories, thereby enhancing the model's generalization ability.
3. **Validating the effectiveness of the GPT model**: The research focuses on validating the effectiveness of the GPT model in different text classification scenarios and achieving zero-shot classification through the design of specific prompt strategies.

Additionally, the paper compares the performance of zero-shot LLMs with traditional machine learning methods, deep learning methods, and other zero-shot methods across multiple datasets. Experimental results show that in the four analyzed datasets, the GPT model demonstrates better zero-shot classification performance on three datasets. This is particularly valuable for small businesses and teams, as these models simplify the text classification process, avoiding complex feature extraction and classifier training steps, and thus have high practical value.
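To make the idea of prompt-based zero-shot classification concrete, here is a minimal Python sketch. It is not the paper's actual prompt or pipeline: the prompt wording, the `Label:` answer convention, and the names `build_prompt`, `parse_label`, `classify`, and `fake_llm` are all assumptions made for illustration. The model is passed in as a plain callable, so a real GPT client could be substituted for the canned stand-in used here.

```python
import re
from typing import Callable, Optional, Sequence


def build_prompt(text: str, labels: Sequence[str]) -> str:
    """Assemble a zero-shot prompt: task, candidate labels, input text,
    and a step-by-step reasoning cue (wording is an assumption)."""
    label_list = ", ".join(labels)
    return (
        f"Classify the text into exactly one of these categories: {label_list}.\n"
        f"Text: {text}\n"
        "Let's think step by step, then finish with a line of the form "
        "'Label: <category>'."
    )


def parse_label(reply: str, labels: Sequence[str]) -> Optional[str]:
    """Return the candidate label named on the last 'Label:' line, or None
    if the reply has no such line or names an unknown category."""
    matches = re.findall(r"Label:\s*(.+)", reply, flags=re.IGNORECASE)
    if not matches:
        return None
    answer = matches[-1].strip().strip(".").lower()
    for label in labels:
        if label.lower() == answer:
            return label
    return None


def classify(text: str, labels: Sequence[str], llm: Callable[[str], str]) -> Optional[str]:
    """Run one zero-shot classification: no training data, only a prompt."""
    return parse_label(llm(build_prompt(text, labels)), labels)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; always returns a canned reply."""
    return "The text praises the product.\nLabel: Positive."
```

Because the label set is only a prompt argument, adding a new category means changing the `labels` list rather than retraining a classifier, which is the property the summary describes as robustness to unseen categories.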

Large Language Models are Zero-Shot Reasoners

Adaptable and Reliable Text Classification using Large Language Models

LLM-powered Zero-shot Online Log Parsing

Large Language Models as Data Preprocessors

Large Language Models are Good Prompt Learners for Low-Shot Image Classification

Spoken Language Intelligence of Large Language Models for Language Learning

Large Language Models are Zero Shot Hypothesis Proposers

Generation-driven Contrastive Self-training for Zero-shot Text Classification with Instruction-following LLM

Generating Training Data with Language Models: Towards Zero-Shot Language Understanding

Zero-Shot Question Answering over Financial Documents using Large Language Models

Large Language Models Are Zero-Shot Rankers for Recommender Systems

Beat LLMs at Their Own Game: Zero-Shot LLM-Generated Text Detection Via Querying ChatGPT

Empirical Study of Zero-Shot NER with ChatGPT

Pre-trained Language Models can be Fully Zero-Shot Learners

Zero-Shot Chain-of-Thought Reasoning Guided by Evolutionary Algorithms in Large Language Models

Can Large Language Models Grasp Event Signals? Exploring Pure Zero-Shot Event-based Recognition

Zero-Shot Next-Item Recommendation using Large Pretrained Language Models

Enhancing Text-based Knowledge Graph Completion with Zero-Shot Large Language Models: A Focus on Semantic Enhancement

What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models

Large Language Models are Strong Zero-Shot Retriever