Evaluating AI fairness in credit scoring with the BRIO tool

Greta Coraglia, Francesco A. Genco, Pellegrino Piantadosi, Enrico Bagli, Pietro Giuffrida, Davide Posillipo, Giuseppe Primiero
2024-06-05
Abstract: We present a method for quantitative, in-depth analyses of fairness issues in AI systems with an application to credit scoring. To this end we use BRIO, a tool for the evaluation of AI systems with respect to social unfairness and, more generally, ethically undesirable behaviours. It features a model-agnostic bias detection module, presented in \cite{DBLP:conf/beware/CoragliaDGGPPQ23}, to which a full-fledged unfairness risk evaluation module is added. As a case study, we focus on the context of credit scoring, analysing the UCI German Credit Dataset \cite{misc_statlog_(german_credit_data)_144}. We apply the BRIO fairness metrics to several socially sensitive attributes featured in the German Credit Dataset, quantifying fairness across various demographic segments, with the aim of identifying potential sources of bias and discrimination in a credit scoring model. We conclude by combining our results with a revenue analysis.
Artificial Intelligence
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

The paper addresses fairness in credit scoring. Specifically, the authors propose a method for quantitative, in-depth analysis of fairness issues in artificial intelligence (AI) systems used for credit scoring. They use a tool called BRIO, which evaluates AI systems with respect to social unfairness and, more generally, ethically undesirable behaviours.

### Main Research Background

In recent years, AI has brought transformative changes to many fields, especially those involving decision-making processes. Credit scoring is a particularly important case: scoring algorithms play a key role in determining individual creditworthiness, which affects access to finance, housing, and employment opportunities. AI is expected to improve the accuracy and efficiency of credit scoring, but its inherent opacity makes fairness hard to ensure, particularly where biases may exacerbate or perpetuate social inequalities.

### Research Objectives

1. **Quantify Fairness**: Use the BRIO tool to quantify the fairness of a credit scoring model across multiple socially sensitive attributes, identifying potential sources of bias and discrimination.
2. **Risk Assessment**: Assess the unfairness risk of the credit scoring model across demographic groups and combine the results with a revenue analysis.
3. **Method Validation**: Apply the BRIO tool to the UCI German Credit Dataset to demonstrate its effectiveness and practicality in analyzing fairness in credit scoring.

### Research Methods

1. **Dataset Selection**: Use the UCI German Credit Dataset, which contains 1000 instances, each with 20 input variables and a binary label (indicating default or not).
2. **Sensitive Attribute Selection**: Select gender, age group, and foreign worker status as sensitive attributes.
3. **Model Construction**: Use the OptBinning library to construct a machine learning model, including variable binning and scorecard generation.
4. **Fairness Analysis**: Use the BRIO tool for bias detection and risk assessment, comparing model behavior across the categories of each sensitive attribute and calculating various bias metrics.

### Main Findings

1. **Gender Differences**: Males perform better than females in terms of default risk.
2. **Age Differences**: Older groups perform better than younger groups in terms of default risk.
3. **Nationality Differences**: Native workers perform better than foreign workers in terms of default risk.

### Conclusion

Using the BRIO tool, the authors identify potential sources of bias and discrimination in the credit scoring model and propose some mitigation strategies. These findings matter for ensuring the fairness of credit scoring systems and promoting financial inclusion.
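The fairness analysis step listed under Research Methods compares how a model behaves across the categories of a sensitive attribute. As a rough illustration of that idea, the sketch below measures the total variation distance between the predicted-label distributions of two groups and flags the pair when the distance exceeds a threshold. This is not BRIO's API: the function names, the toy data, the choice of total variation distance, and the 0.2 threshold are all assumptions made for illustration.

```python
from collections import Counter


def outcome_distribution(predictions):
    """Return the relative frequency of each predicted label."""
    counts = Counter(predictions)
    total = len(predictions)
    return {label: n / total for label, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions (0 to 1)."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(l, 0.0) - q.get(l, 0.0)) for l in labels)


def group_divergence(predictions, groups, group_a, group_b):
    """Distance between the prediction distributions of two groups."""
    preds_a = [y for y, g in zip(predictions, groups) if g == group_a]
    preds_b = [y for y, g in zip(predictions, groups) if g == group_b]
    if not preds_a or not preds_b:
        raise ValueError("both groups need at least one prediction")
    return total_variation(
        outcome_distribution(preds_a), outcome_distribution(preds_b)
    )


# Toy data (hypothetical): 1 = predicted default, 0 = predicted no default.
predictions = [0, 0, 1, 0, 1, 1, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

THRESHOLD = 0.2  # assumed tolerance, chosen only for this example
divergence = group_divergence(predictions, groups, "A", "B")
flagged = divergence > THRESHOLD
print(f"divergence={divergence:.2f}, flagged={flagged}")
```

A distance of 0 means both groups receive the same mix of predictions, and 1 means the mixes do not overlap at all. What counts as an acceptable distance is a policy decision that depends on the application.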
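The model construction step mentions variable binning and scorecard generation. Scorecards built on binned variables commonly replace each bin with its weight of evidence (WoE), and the sketch below computes that quantity in plain Python under one common sign convention (log of the non-default share over the default share). It does not use the OptBinning API, and the bin names and toy labels are hypothetical.

```python
import math


def weight_of_evidence(bins, labels):
    """WoE per bin: ln(share of non-defaults in bin / share of defaults in bin).

    `labels` uses 1 for default and 0 for non-default.
    """
    total_good = sum(1 for y in labels if y == 0)
    total_bad = sum(1 for y in labels if y == 1)
    woe = {}
    for b in sorted(set(bins)):
        good = sum(1 for x, y in zip(bins, labels) if x == b and y == 0)
        bad = sum(1 for x, y in zip(bins, labels) if x == b and y == 1)
        if good == 0 or bad == 0:
            # A bin with no defaults or no non-defaults has undefined WoE;
            # in practice such bins are merged or smoothed.
            raise ValueError(f"bin {b!r} lacks defaults or non-defaults")
        woe[b] = math.log((good / total_good) / (bad / total_bad))
    return woe


# Toy data (hypothetical): a variable already split into two bins.
bins = ["low", "low", "low", "low", "high", "high", "high", "high"]
labels = [0, 0, 1, 1, 0, 0, 0, 1]

woe = weight_of_evidence(bins, labels)
for name, value in woe.items():
    print(f"{name}: {value:+.3f}")
```

Under this convention a positive WoE marks a bin with relatively fewer defaults than the population as a whole, and a negative WoE marks a bin with relatively more.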