Best Practices for Text Annotation with Large Language Models

Petter Törnberg
2024-02-05
Abstract: Large Language Models (LLMs) have ushered in a new era of text annotation, as their ease-of-use, high accuracy, and relatively low costs have meant that their use has exploded in recent months. However, the rapid growth of the field has meant that LLM-based annotation has become something of an academic Wild West: the lack of established practices and standards has led to concerns about the quality and validity of research. Researchers have warned that the ostensible simplicity of LLMs can be misleading, as they are prone to bias, misunderstandings, and unreliable results. Recognizing the transformative potential of LLMs, this paper proposes a comprehensive set of standards and best practices for their reliable, reproducible, and ethical use. These guidelines span critical areas such as model selection, prompt engineering, structured prompting, prompt stability analysis, rigorous model validation, and the consideration of ethical and legal implications. The paper emphasizes the need for a structured, directed, and formalized approach to using LLMs, aiming to ensure the integrity and robustness of text annotation practices, and advocates for a nuanced and critical engagement with LLMs in social scientific research.
Computer Science
What problem does this paper attempt to address?
This paper addresses the lack of established standards and best practices for text annotation with large language models (LLMs). As LLM use in text analysis has spread, the absence of shared guidelines has raised concerns about research quality, bias, and reliability. The paper proposes a comprehensive set of standards and best practices covering model selection, prompt engineering, structured prompting, prompt stability analysis, model validation, and ethical and legal considerations, with the aim of ensuring the reliable, reproducible, and ethical use of LLMs in social science research. It emphasizes the need for a structured, directed, and formalized approach to using LLMs and advocates nuanced, critical engagement with them.