Bias Unveiled: Investigating Social Bias in LLM-Generated Code

Lin Ling, Fazle Rabbi, Song Wang, Jinqiu Yang
2024-11-16
Abstract: Large language models (LLMs) have significantly advanced the field of automated code generation. However, a notable research gap exists in the evaluation of social biases that may be present in the code produced by LLMs. To solve this issue, we propose a novel fairness framework, i.e., Solar, to assess and mitigate the social biases of LLM-generated code. Specifically, Solar can automatically generate test cases for quantitatively uncovering social biases of the auto-generated code by LLMs. To quantify the severity of social biases in generated code, we develop a dataset that covers a diverse set of social problems. We applied Solar and the crafted dataset to four state-of-the-art LLMs for code generation. Our evaluation reveals severe bias in the LLM-generated code from all the subject LLMs. Furthermore, we explore several strategies for bias mitigation, including Chain-of-Thought (CoT) prompting, combining positive role-playing with CoT prompting and iterative prompting. Our experiments show that iterative prompting can effectively reduce social bias in LLM-generated code by up to 90%. Solar is highly extensible to evaluate new social problems.
Software Engineering
What problem does this paper attempt to address?
This paper aims to evaluate and mitigate social biases in code generated by large language models (LLMs). Specifically, the researchers are concerned that automatically generated code may treat demographic groups unfairly or discriminate against them.

### Problem Background

In recent years, LLMs such as Codex, CodeGen, and StarCoder have made remarkable progress in automated code generation. However, whether the code these models generate contains social biases remains an open research gap. Existing benchmarks such as HumanEval and MBPP focus mainly on functional correctness and overlook fairness, in particular the detection of biases in the code against different population groups.

### Research Objectives

To fill this gap, the authors propose a new fairness evaluation framework called Solar and develop a dataset covering various social issues to quantify and mitigate social biases in LLM-generated code. Specific objectives include:

1. **Evaluating Social Biases**: Quantitatively reveal social biases in LLM-generated code through automated test case generation.
2. **Developing a Dataset**: Construct a dataset named SocialBias-Bench containing 343 social issues across seven categories, such as social welfare access and university admission eligibility.
3. **Exploring Mitigation Strategies**: Investigate different prompting strategies, such as Chain-of-Thought (CoT) prompting, combining positive role-playing with CoT prompting, and iterative prompting, to reduce social biases in the code.

### Main Contributions

1. **Scalable Evaluation Dataset**: SocialBias-Bench contains multiple real-world social issues for evaluating social biases in LLM-generated code.
2. **Fairness Evaluation Framework Solar**: Based on the concept of metamorphic testing, Solar can quantify the fairness of LLM-generated code and is applicable to LLMs of any architecture.
3. **Ablation Study**: Explore the impact of temperature and judgmental words on fairness evaluation.
4. **Mitigation Strategy Exploration**: Experiments show that iterative prompting can effectively reduce social biases in the code by up to 90%.

### Method Overview

- **Task Definition and Code Prompt Generation**: Solar automatically generates code prompts and executable test cases according to the task definition.
- **Testing Code Biases**: Change one sensitive attribute at a time (such as gender or age) and check whether the behavior of the generated code stays consistent.
- **Mitigation Strategies**: Adjust the prompts through a feedback mechanism to gradually reduce biases in the code.

### Experimental Results

The authors conducted experiments on four state-of-the-art LLMs. The results show that all models exhibit social bias to varying degrees when generating code. In particular, GPT-3.5-turbo-0125 exhibits the highest overall bias score (CBS), reaching 60.58%, and its biases with respect to age, gender, and employment status are particularly severe. In addition, the iterative prompting strategy is shown to significantly reduce biases while maintaining the functional correctness of the code.

In conclusion, this paper aims to reveal and address social biases in LLM-generated code, providing tools and methods for future research and applications.
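
The bias test described under Method Overview (vary one sensitive attribute, hold everything else fixed, and check that the output does not change) can be illustrated with a minimal Python sketch. This is not Solar's implementation: the attribute names and values, the two candidate functions standing in for LLM-generated code, and the simple `bias_rate` aggregate are all assumptions made for illustration, and the paper's exact CBS definition may differ.

```python
from itertools import product

# Hypothetical sensitive attributes and values (illustrative, not from the paper).
SENSITIVE = {
    "gender": ["female", "male", "non-binary"],
    "age": [25, 45, 70],
}
# Hypothetical task-relevant attributes that may legitimately affect the outcome.
RELEVANT = {
    "income": [10_000, 60_000],
}


def biased_attributes(func):
    """Return the sensitive attributes whose value alone changes func's output."""
    flagged = set()
    names = list(SENSITIVE) + list(RELEVANT)
    domains = [SENSITIVE[n] if n in SENSITIVE else RELEVANT[n] for n in names]
    for values in product(*domains):
        base = dict(zip(names, values))
        expected = func(base)
        # Metamorphic relation: changing only a sensitive attribute
        # must leave the decision unchanged.
        for attr, options in SENSITIVE.items():
            for alt in options:
                mutated = {**base, attr: alt}
                if func(mutated) != expected:
                    flagged.add(attr)
    return flagged


def bias_rate(funcs):
    """Share of functions flagged for at least one sensitive attribute (assumed metric)."""
    flagged = sum(1 for f in funcs if biased_attributes(f))
    return flagged / len(funcs)


# Stand-ins for LLM-generated code on a hypothetical welfare-eligibility task.
def fair_candidate(person):
    return person["income"] < 30_000


def biased_candidate(person):
    # Age should not matter for this hypothetical task, so this is a bias.
    return person["income"] < 30_000 and person["age"] < 65
```

With these two stand-ins, `biased_attributes(fair_candidate)` is empty, `biased_attributes(biased_candidate)` returns `{"age"}`, and `bias_rate([fair_candidate, biased_candidate])` is `0.5`. Enumerating every input combination is only practical for small attribute domains; a larger task would need sampled test cases.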