LLM-in-the-loop: Leveraging Large Language Model for Thematic Analysis

Shih-Chieh Dai, Aiping Xiong, Lun-Wei Ku

2023-10-24

Abstract: Thematic analysis (TA) has been widely used for analyzing qualitative data in many disciplines and fields. To ensure reliable analysis, the same piece of data is typically assigned to at least two human coders. Moreover, to produce meaningful and useful analysis, human coders develop and deepen their data interpretation and coding over multiple iterations, making TA labor-intensive and time-consuming. Recently, the emerging field of large language model (LLM) research has shown that LLMs have the potential to replicate human-like behavior in various tasks: in particular, LLMs outperform crowd workers on text-annotation tasks, suggesting an opportunity to leverage LLMs on TA. We propose a human-LLM collaboration framework (i.e., LLM-in-the-loop) to conduct TA with in-context learning (ICL). This framework provides the prompt to frame discussions with an LLM (e.g., GPT-3.5) to generate the final codebook for TA. We demonstrate the utility of this framework using survey datasets on the aspects of the music listening experience and the usage of a password manager. Results of the two case studies show that the proposed framework yields similar coding quality to that of human coders but reduces TA's labor and time demands.

Computation and Language

What problem does this paper attempt to address?

The paper explores how to leverage Large Language Models (LLMs) to make Thematic Analysis (TA) more efficient without sacrificing quality. Specifically, it addresses the following issues:

1. **Improving TA Efficiency**: Traditional thematic analysis typically requires at least two human coders with relevant expertise to participate in the entire process, which is both time-consuming and labor-intensive. The paper proposes a human-LLM collaboration framework (LLM-in-the-loop) aimed at reducing the labor and time needed for TA.
2. **Ensuring Analysis Quality**: To ensure reliable analysis, the same data is traditionally assigned to at least two coders for independent coding. By introducing the LLM as a Machine Coder (MC) that collaborates with a Human Coder (HC), the paper explores whether this arrangement can maintain coding quality while requiring fewer human coders.
3. **Addressing LLM Input Limitations**: Because LLMs limit the size of their input, long texts cannot be processed in full. The paper proposes using only a portion of the data to generate the codebook.

In summary, the study aims to verify whether combining human and LLM capabilities yields a more efficient thematic analysis method while maintaining or even improving the quality of the analysis. Through two case studies (survey data on Music Shuffle and on Password Manager usage), the paper demonstrates that the proposed framework reduces the time and labor costs of TA while achieving quality comparable to that of two human coders.
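Two of the ideas above can be illustrated with a short Python sketch: drawing a subset of responses that fits an input budget, and measuring how closely a machine coder agrees with a human coder. Everything here is an assumption made for illustration rather than the paper's implementation. The function names `sample_within_budget` and `cohen_kappa` are hypothetical, the budget is counted in characters instead of model tokens, each response is assumed to receive exactly one code, and Cohen's kappa is used only because it is a common agreement measure; this summary does not say which metric the paper reports.

```python
import random
from collections import Counter


def sample_within_budget(responses, max_chars, seed=0):
    """Pick a random subset of responses whose total length fits max_chars.

    Assumption: the budget is counted in characters as a stand-in for
    the model's real token limit.
    """
    rng = random.Random(seed)
    shuffled = list(responses)
    rng.shuffle(shuffled)
    subset, used = [], 0
    for text in shuffled:
        # Skip any response that would push the subset over the budget.
        if used + len(text) <= max_chars:
            subset.append(text)
            used += len(text)
    return subset


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who each assign one code per item."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which the two coders chose the same code.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each coder's code frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Made-up survey responses and codes, for illustration only.
    responses = ["I like surprises", "Too many repeats", "Saves me time"]
    print(sample_within_budget(responses, max_chars=30))

    human_codes = ["a", "a", "b", "b"]
    machine_codes = ["a", "a", "b", "a"]
    print(cohen_kappa(human_codes, machine_codes))  # prints 0.5
```

In the example, the coders agree on three of four items (0.75) while chance agreement is 0.5, which gives a kappa of 0.5. Real thematic analysis often allows several codes per response, so a production setup would need a multi-label agreement measure and a token-based budget.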

Using Large Language Models to Support Thematic Analysis in Empirical Legal Studies

Performing an Inductive Thematic Analysis of Semi-Structured Interviews With a Large Language Model: An Exploration and Provocation on the Limits of the Approach

Can Large Language Models emulate an inductive Thematic Analysis of semi-structured interviews? An exploration and provocation on the limits of the approach and the model

An Examination of the Use of Large Language Models to Aid Analysis of Textual Data

Automating Thematic Analysis: How LLMs Analyse Controversial Topics

Large Language Models and Thematic Analysis: Human-AI Synergy in Researching Hate Speech on Social Media

LLM Reading Tea Leaves: Automatically Evaluating Topic Models with Large Language Models

Inductive thematic analysis of healthcare qualitative interviews using open-source large language models: How does it compare to traditional methods?

LLM-Assisted Content Analysis: Using Large Language Models to Support Deductive Coding

Apprentices to Research Assistants: Advancing Research with Large Language Models

Large Language Models for Code Analysis: Do LLMs Really Do Their Job?

The LLM Effect: Are Humans Truly Using LLMs, or Are They Being Influenced By Them Instead?

Neural Topic Modeling with Large Language Models in the Loop

Understanding the Human-LLM Dynamic: A Literature Survey of LLM Use in Programming Tasks

From Voices to Validity: Leveraging Large Language Models (LLMs) for Textual Analysis of Policy Stakeholder Interviews

Human-in-the-loop Machine Translation with Large Language Model

Reflections on Inductive Thematic Saturation as a potential metric for measuring the validity of an inductive Thematic Analysis with LLMs

Large Language Models Offer an Alternative to the Traditional Approach of Topic Modelling