Abstract: The capacity of LLMs to carry out automated qualitative analysis has been questioned by corpus linguists, and it has been argued that corpus-based discourse analysis incorporating LLMs is hindered by issues of unsatisfying performance, hallucination, and irreproducibility. Our proposed method, TACOMORE, aims to address these concerns by serving as an effective prompting framework in this domain. The framework consists of four principles, i.e., Task, Context, Model and Reproducibility, and specifies five fundamental elements of a good prompt, i.e., Role Description, Task Definition, Task Procedures, Contextual Information and Output Format. We conduct experiments on three LLMs, i.e., GPT-4o, Gemini-1.5-Pro and Gemini-1.5-Flash, and find that TACOMORE helps improve LLM performance in three representative discourse analysis tasks, i.e., the analysis of keywords, collocates and concordances, based on an open corpus of COVID-19 research articles. Our findings show the efficacy of the proposed prompting framework TACOMORE in corpus-based discourse analysis in terms of Accuracy, Ethicality, Reasoning, and Reproducibility, and provide novel insights into the application and evaluation of LLMs in automated qualitative studies.

What problem does this paper attempt to address?

This paper addresses the unsatisfactory behavior of large language models (LLMs) in corpus-based discourse analysis, where three issues impede their adoption:

1. **Performance issues**: On complex corpus-based discourse analysis tasks, the accuracy and logical reasoning of LLMs have not yet reached the level of human experts.
2. **Hallucination issues**: LLMs sometimes generate information that is inaccurate or fictional, a serious problem in academic research that demands a high degree of accuracy.
3. **Non-reproducibility**: Because LLM output may vary across environments, hardware, or operators, experimental results are difficult to repeat and verify.

To address these problems, the authors propose a prompting framework named TACOMORE. It aims to improve LLM performance in corpus-based discourse analysis through four principles (Task, Context, Model, Reproducibility) and five basic elements of a prompt (Role Description, Task Definition, Task Procedures, Contextual Information, Output Format).

### Specific improvement measures

- **Task refinement**: Decompose complex tasks into specific steps so that the LLM can work through them one at a time.
- **Provide context**: Give the LLM the contextual information it needs to understand the task background and the material being analyzed.
- **Select an appropriate model**: Choose an LLM that suits the task requirements, for example one able to process a large amount of input data.
- **Ensure reproducibility**: Reduce the variability of LLM output and improve the stability and reproducibility of results through standardized prompt structures and evaluation methods.

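
The five prompt elements can be pictured as fields of a small template. The following is a minimal sketch, not the authors' implementation: the `TacomorePrompt` class, its `render` method, the `##` section markers, and all of the example text are hypothetical and chosen only for illustration. It assembles the five elements in a fixed order so that every run sends the same prompt structure.

```python
from dataclasses import dataclass


@dataclass
class TacomorePrompt:
    """Hypothetical container for the five prompt elements named in the summary."""

    role_description: str
    task_definition: str
    task_procedures: list[str]  # the decomposed steps of the task
    contextual_information: str
    output_format: str

    def render(self) -> str:
        """Join the five elements into one prompt string, always in the same order."""
        # Number the steps so the model is asked to follow them one at a time.
        steps = "\n".join(
            f"{i}. {step}" for i, step in enumerate(self.task_procedures, start=1)
        )
        sections = [
            ("Role Description", self.role_description),
            ("Task Definition", self.task_definition),
            ("Task Procedures", steps),
            ("Contextual Information", self.contextual_information),
            ("Output Format", self.output_format),
        ]
        return "\n\n".join(f"## {title}\n{body}" for title, body in sections)


# Illustrative values only; none of this wording comes from the paper.
prompt = TacomorePrompt(
    role_description="You are a corpus linguist experienced in discourse analysis.",
    task_definition="Group the keywords below into themes.",
    task_procedures=[
        "Read the full keyword list.",
        "Group related keywords into themes.",
        "Justify each theme in one sentence.",
    ],
    contextual_information="Keywords come from a corpus of COVID-19 research articles.",
    output_format="A table with columns: theme, keywords, justification.",
)
prompt_text = prompt.render()
print(prompt_text)
```

Keeping the structure in one place means a change to any element is applied uniformly across tasks and models, which is one plausible way to support the standardized prompt structure mentioned above.
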
### Experimental verification

The authors ran experiments on three representative discourse analysis tasks (keyword analysis, collocate analysis, and concordance analysis) over an open corpus of COVID-19 research articles, using three LLMs: GPT-4o, Gemini-1.5-Pro, and Gemini-1.5-Flash. The results indicate that the TACOMORE framework improves LLM performance on these tasks in terms of Accuracy, Ethicality, Reasoning, and Reproducibility.

In conclusion, the paper addresses key problems of LLMs in corpus-based discourse analysis by proposing the TACOMORE framework, and it offers new insights into applying and evaluating LLMs in automated qualitative studies.
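
One simple way to put a number on run-to-run stability is to repeat the same prompt several times and measure how much the outputs overlap. The sketch below is an illustrative assumption and not the Reproducibility measure used in the paper: it treats each run's output as a set of labels and averages the Jaccard similarity over all pairs of runs. The function names and the sample data are hypothetical.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_pairwise_agreement(runs: list[set[str]]) -> float:
    """Average Jaccard similarity over every pair of runs."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("need at least two runs to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Made-up theme labels from three hypothetical repeated runs of one prompt.
runs = [
    {"transmission", "vaccination", "mental health"},
    {"transmission", "vaccination", "mental health"},
    {"transmission", "vaccination", "public policy"},
]
score = mean_pairwise_agreement(runs)
print(f"mean pairwise agreement: {score:.2f}")  # prints 0.67
```

A score of 1.0 means every run produced the same set of labels. Lower values indicate that the output drifts between runs, which is the behavior a reproducibility-oriented prompt tries to reduce.
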

TACOMORE: Leveraging the Potential of LLMs in Corpus-based Discourse Analysis with Prompt Engineering
