Abstract: This guide introduces Large Language Models (LLMs) as a highly versatile text analysis method within the social sciences. As LLMs are easy to use, cheap, fast, and applicable to a broad range of text analysis tasks, ranging from text annotation and classification to sentiment analysis and critical discourse analysis, many scholars believe that LLMs will transform how we do text analysis. This how-to guide is aimed at students and researchers with limited programming experience, and offers a simple introduction to how LLMs can be used for text analysis in your own research project, as well as advice on best practices. We will go through each of the steps of analyzing textual data with LLMs using Python: installing the software, setting up the API, loading the data, developing an analysis prompt, analyzing the text, and validating the results. As an illustrative example, we will use the challenging task of identifying populism in political texts, and show how LLMs move beyond the existing state-of-the-art.
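The middle steps listed in the abstract (loading the data, developing an analysis prompt, analyzing the text) could look roughly like the sketch below. It is not the guide's own code: the prompt wording, the label names, the `text` column name, and the helper functions are all assumptions made for illustration, and `fake_llm` is a keyword-based stand-in so the example runs without an API key. In practice you would replace it with a function that sends the prompt to the LLM provider of your choice.

```python
import csv
import io
from typing import Callable, Iterable, List

# Hypothetical prompt; the guide's actual prompt is not reproduced here.
PROMPT_TEMPLATE = (
    "You are annotating political texts.\n"
    "Answer with a single word, 'populist' or 'not_populist'.\n\n"
    "Text: {text}\n"
    "Answer:"
)


def load_texts(csv_file: Iterable[str], column: str = "text") -> List[str]:
    """Read one column of texts from an open CSV file (column name is assumed)."""
    return [row[column] for row in csv.DictReader(csv_file)]


def build_prompt(text: str) -> str:
    """Insert one document into the analysis prompt."""
    return PROMPT_TEMPLATE.format(text=text.strip())


def parse_label(raw: str) -> str:
    """Map a free-text model reply onto a fixed label set."""
    cleaned = raw.strip().lower()
    if cleaned.startswith("not"):
        return "not_populist"
    if cleaned.startswith("populist"):
        return "populist"
    return "unclear"  # keep unparseable replies visible for manual review


def annotate(texts: Iterable[str], ask_llm: Callable[[str], str]) -> List[str]:
    """Send each text to the model and collect the parsed labels."""
    return [parse_label(ask_llm(build_prompt(t))) for t in texts]


def fake_llm(prompt: str) -> str:
    """Stand-in for a real API call: a crude keyword rule, NOT a real model."""
    text = prompt.split("Text:", 1)[1]
    return "Populist" if "elite" in text.lower() else "Not_populist"


# Made-up example data, held in memory instead of a file on disk.
SAMPLE_CSV = io.StringIO(
    "id,text\n"
    "1,The corrupt elite has betrayed the people.\n"
    "2,The committee meets on Tuesday.\n"
)

if __name__ == "__main__":
    texts = load_texts(SAMPLE_CSV)
    for text, label in zip(texts, annotate(texts, fake_llm)):
        print(f"{label:>13}  {text}")
```

Passing the model in as a plain function (`ask_llm`) keeps the prompt and parsing logic independent of any particular provider, which also makes it easy to test the pipeline before spending money on API calls.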

What problem does this paper attempt to address?

The paper explores the application of Large Language Models (LLMs) to the analysis of social science texts. It aims to address the limitations of traditional text analysis methods, such as the need for deep expertise, the extensive manual coding of training data, and poor handling of sarcasm and context. Specifically, the paper addresses the following key issues:

1. **Simplify Text Analysis**: LLMs are easy to use, cost-effective, and fast, and they suit a wide range of text analysis tasks, including text annotation, classification, sentiment analysis, and critical discourse analysis. This enables students and researchers with limited programming experience to conduct text analysis.
2. **Improve Analysis Accuracy**: Traditional natural language processing and machine learning methods often have limited accuracy when dealing with complex language phenomena such as sarcasm and context-dependent interpretation. LLMs can overcome these limitations: they can perform almost any text analysis task and, in some cases, outperform human experts.
3. **Standardization and Reproducibility**: LLMs provide standardized and reproducible methods for text analysis, which helps to reduce bias in manual analysis and to enhance research rigor and data quality, especially in large-scale text analyses.
4. **Cross-Domain Applicability**: The paper points out that LLMs are not limited to specific tasks but can adapt to different text analysis challenges, such as identifying populist tendencies in political texts, without retraining.
5. **Challenging the Quantitative-Qualitative Analysis Boundaries**: By making new analysis tasks possible, LLMs blur the traditional boundaries between quantitative and qualitative research, promoting the integration of analytical methods in the social sciences.
The paper illustrates how to use LLMs for text analysis with a concrete example, measuring populism in political texts, and shows how the approach can address long-standing difficulties in quantifying complex concepts. It also discusses the limitations and potential biases to consider when using LLMs for text analysis, emphasizing the importance of validating results and of ethical considerations. In summary, the paper is a practical guide that instructs readers on how to use LLMs for efficient and accurate text analysis in their own research projects.
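One common way to validate LLM annotations is to compare them with labels assigned by human coders on a sample of the data. The sketch below computes raw accuracy and Cohen's kappa (agreement corrected for chance) in plain Python. It is an illustration of that general idea, not the validation procedure used in the paper, and the labels in the usage example are made up.

```python
from collections import Counter
from typing import Sequence


def accuracy(human: Sequence[str], model: Sequence[str]) -> float:
    """Share of items on which the human and the model labels agree."""
    if len(human) != len(model) or not human:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(h == m for h, m in zip(human, model)) / len(human)


def cohens_kappa(human: Sequence[str], model: Sequence[str]) -> float:
    """Cohen's kappa: (observed - expected) / (1 - expected) agreement."""
    n = len(human)
    observed = accuracy(human, model)
    human_counts = Counter(human)
    model_counts = Counter(model)
    # Agreement expected by chance, given each annotator's label frequencies.
    expected = sum(
        human_counts[label] * model_counts[label] for label in human_counts
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Made-up labels for four texts, for illustration only.
    human_labels = ["populist", "populist", "not_populist", "not_populist"]
    model_labels = ["populist", "not_populist", "not_populist", "not_populist"]
    print("accuracy:", accuracy(human_labels, model_labels))
    print("kappa:   ", cohens_kappa(human_labels, model_labels))
```

Kappa is worth reporting alongside accuracy because accuracy alone can look high when one label dominates the data, even if the model adds little beyond guessing the majority class.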

How to use LLMs for Text Analysis

Best Practices for Text Annotation with Large Language Models

An Examination of the Use of Large Language Models to Aid Analysis of Textual Data

Open-Source LLMs for Text Annotation: A Practical Guide for Model Setting and Fine-Tuning

LLMs for science: Usage for code generation and data analysis

Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond

Can Large Language Models Transform Computational Social Science?

Large Language Models for Conducting Advanced Text Analytics Information Systems Research

Prompt Refinement or Fine-tuning? Best Practices for using LLMs in Computational Social Science Tasks

Automating Thematic Analysis: How LLMs Analyse Controversial Topics

Apprentices to Research Assistants: Advancing Research with Large Language Models

Large Language Models: An Applied Econometric Framework

How LLMs Aid in UML Modeling: An Exploratory Study with Novice Analysts

The LLM Effect: Are Humans Truly Using LLMs, or Are They Being Influenced By Them Instead?

LLM-in-the-loop: Leveraging Large Language Model for Thematic Analysis

Large language models and academic writing: Five tiers of engagement

Large Language Models and Thematic Analysis: Human-AI Synergy in Researching Hate Speech on Social Media

LLMs as Research Tools: A Large Scale Survey of Researchers' Usage and Perceptions