Automatic Text Classification With Large Language Models: A Review of openai for Zero- and Few-Shot Classification
Kylie L. Anglin, Claudia Ventura
DOI: https://doi.org/10.3102/10769986241279927
2024-12-01
Journal of Educational and Behavioral Statistics
Abstract: While natural language documents, such as intervention transcripts and participant writing samples, can provide highly nuanced insights into educational and psychological constructs, researchers often find these materials difficult and expensive to analyze. Recent developments in machine learning, however, have allowed social scientists to harness the power of artificial intelligence for complex data categorization tasks. One approach, supervised learning, supports high-performance categorization yet still requires a large, hand-labeled training corpus, which can be costly. An alternative approach, zero- and few-shot classification with pretrained large language models, offers a cheaper, compelling alternative. This article considers the application of zero-shot and few-shot classification in educational research. We provide an overview of large language models, a step-by-step tutorial on using the Python openai package for zero-shot and few-shot classification, and a discussion of relevant research considerations for social scientists.
education & educational research; psychology, mathematical; social sciences, mathematical methods