The Life Cycle of Large Language Models: A Review of Biases in Education

Jinsook Lee, Yann Hicke, Renzhe Yu, Christopher Brooks, René F. Kizilcec
2024-06-04
Abstract: Large Language Models (LLMs) are increasingly adopted in educational contexts to provide personalized support to students and teachers. The unprecedented capacity of LLM-based applications to understand and generate natural language can potentially improve instructional effectiveness and learning outcomes, but the integration of LLMs in education technology has renewed concerns over algorithmic bias which may exacerbate educational inequities. In this review, building on prior work on mapping the traditional machine learning life cycle, we provide a holistic map of the LLM life cycle from the initial development of LLMs to customizing pre-trained models for various applications in educational settings. We explain each step in the LLM life cycle and identify potential sources of bias that may arise in the context of education. We discuss why current measures of bias from traditional machine learning fail to transfer to LLM-generated content in education, such as tutoring conversations, because the text is high-dimensional, there can be multiple correct responses, and tailoring responses may be pedagogically desirable rather than unfair. This review aims to clarify the complex nature of bias in LLM applications and provide practical guidance for their evaluation to promote educational equity.
Computers and Society, Computation and Language
### What problems does this paper attempt to solve?

This review article explores the potential bias issues of large language models (LLMs) in educational applications and provides a comprehensive framework for understanding the sources of these biases, evaluation methods, and mitigation strategies. Specifically, the paper focuses on the following aspects:

1. **Identifying potential biases**:
   - The application of LLMs in education may inadvertently exacerbate existing educational inequalities. For example, algorithmic biases may have a negative impact on vulnerable groups, leading to unfair results.
   - The article discusses in detail various types of biases that may occur at different stages of the LLM life cycle, including representational bias and allocative bias.
2. **Constructing an LLM life-cycle model**:
   - Based on the traditional machine-learning life-cycle framework, the article proposes a new LLM life-cycle model that covers all steps from initial development to final deployment.
   - Each step may introduce biases, and the article analyzes these steps and their potential sources of bias in detail.
3. **Methods for evaluating and mitigating biases**:
   - The article discusses why the current bias-measurement methods used to evaluate traditional machine-learning models cannot be directly applied to the educational applications of LLMs.
   - It proposes specific evaluation methods tailored to the characteristics of LLMs and explores how to mitigate these biases through technical means and social measures.
4. **Promoting educational equity**:
   - The article emphasizes the importance of systematically evaluating biases in LLMs to avoid inadvertently magnifying existing inequalities in educational opportunities and achievements.
   - It provides practical guidance for researchers, practitioners, and policymakers to help them better understand and address the ethical issues of LLMs in education.
### Formula presentation

When discussing bias evaluation, the article mentions some specific measurement methods. Here are several key formulas:

- **Cross-Entropy Loss**:
  \[ L = -\sum_{i = 1}^{N} y_i \log(p_i) \]
  where \(y_i\) is the true label and \(p_i\) is the predicted probability.
- **Word Embedding Association Test (WEAT)** effect size:
  \[ d(X, Y; A, B) = \frac{\text{mean}_{x\in X}\, s(x, A, B) - \text{mean}_{y\in Y}\, s(y, A, B)}{\text{stddev}_{w\in X\cup Y}\, s(w, A, B)} \]
  with the per-word association
  \[ s(w, A, B) = \text{mean}_{a\in A}\cos(w, a) - \text{mean}_{b\in B}\cos(w, b) \]
  where \(X\) and \(Y\) are two sets of target words, and \(A\) and \(B\) are sets of attribute words. The numerator compares the two target sets, so both \(X\) and \(Y\) must appear in it.
- **Log-Probability Bias Score (LPBS)**:
  \[ \text{LPBS}(G_1, G_2) = \frac{1}{|T|}\sum_{t\in T}\log\left(\frac{p(G_1|t)}{p(G_2|t)}\right) \]
  where \(G_1\) and \(G_2\) are two social groups, and \(T\) is a set of template sentences.

Through these formulas and a detailed life-cycle analysis, the article aims to provide a theoretical basis and practical guidance for the safe and fair application of LLMs in the education field.
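
The three measures can be sketched in plain Python to make the quantities concrete. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names and the toy 2-D "embeddings" are hypothetical, the WEAT function uses the standard effect-size form (difference between the mean associations of \(X\) and \(Y\), divided by the standard deviation over \(X\cup Y\)), the sample standard deviation is one possible choice, and `lpbs` assumes the per-template probabilities \(p(G_1|t)\) and \(p(G_2|t)\) have already been obtained from a model.

```python
import math
from statistics import mean, stdev


def cross_entropy(y_true, p_pred):
    """L = -sum_i y_i * log(p_i); terms with y_i == 0 are skipped."""
    return -sum(y * math.log(p) for y, p in zip(y_true, p_pred) if y > 0)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, A, B):
    """s(w, A, B): mean cosine of w to set A minus mean cosine of w to set B."""
    return mean([cosine(w, a) for a in A]) - mean([cosine(w, b) for b in B])


def weat_effect_size(X, Y, A, B):
    """Difference of mean associations of X and Y, scaled by the spread over X and Y."""
    s_x = [association(x, A, B) for x in X]
    s_y = [association(y, A, B) for y in Y]
    # Assumption: sample standard deviation (n - 1); some implementations use n.
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


def lpbs(p_g1, p_g2):
    """Mean over templates of log(p(G1|t) / p(G2|t))."""
    return mean([math.log(p1 / p2) for p1, p2 in zip(p_g1, p_g2)])


# Hypothetical toy data: X leans toward attribute set A, Y toward B.
A = [[1.0, 0.0]]
B = [[0.0, 1.0]]
X = [[1.0, 0.0], [1.0, 0.1]]
Y = [[0.0, 1.0], [0.1, 1.0]]

print("cross-entropy:", cross_entropy([0, 1, 0], [0.2, 0.7, 0.1]))
print("WEAT effect size:", weat_effect_size(X, Y, A, B))
print("LPBS:", lpbs([0.6, 0.5], [0.3, 0.5]))
```

A positive WEAT effect size here means \(X\) is more strongly associated with \(A\) than \(Y\) is, and swapping \(X\) and \(Y\) flips the sign. An LPBS of zero means the two groups receive equal probability on average across templates.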