Abstract: We present a method for quantitative, in-depth analysis of fairness issues in AI systems, with an application to credit scoring. To this end, we use BRIO, a tool for evaluating AI systems with respect to social unfairness and, more generally, ethically undesirable behaviours. It features a model-agnostic bias detection module, presented in \cite{DBLP:conf/beware/CoragliaDGGPPQ23}, to which a full-fledged unfairness risk evaluation module is added. As a case study, we focus on credit scoring, analysing the UCI German Credit Dataset \cite{misc_statlog_(german_credit_data)_144}. We apply the BRIO fairness metrics to several socially sensitive attributes featured in the German Credit Dataset, quantifying fairness across various demographic segments, with the aim of identifying potential sources of bias and discrimination in a credit scoring model. We conclude by combining our results with a revenue analysis.

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

The paper addresses fairness in credit scoring. Specifically, the authors propose a method for quantitative, in-depth analysis of fairness issues in artificial intelligence (AI) systems used for credit scoring. They use a tool called BRIO, which evaluates AI systems with respect to social unfairness and, more generally, ethically undesirable behaviours.

### Main Research Background

In recent years, AI has brought transformative changes to many fields, especially those involving decision-making processes. Credit scoring is a particularly important case: credit scoring algorithms play a key role in determining individual creditworthiness, which in turn affects access to finance, housing, employment opportunities, and more. Applying AI to credit scoring is expected to improve accuracy and efficiency, but its inherent opacity makes it hard to ensure fairness, particularly with respect to biases that may perpetuate or exacerbate social inequalities.

### Research Objectives

1. **Quantify Fairness**: Use the BRIO tool to quantify fairness in credit scoring models across multiple socially sensitive attributes, identifying potential sources of bias and discrimination.
2. **Risk Assessment**: Combine the fairness results with a revenue analysis to assess the unfairness risk of credit scoring models across different demographic groups.
3. **Method Validation**: Apply the BRIO tool to the UCI German Credit Dataset to demonstrate its effectiveness and practicality in analyzing fairness in credit scoring.

### Research Methods

1. **Dataset Selection**: Use the UCI German Credit Dataset, which contains 1,000 instances, each with 20 input variables and a binary label (indicating default or not).
2. **Sensitive Attribute Selection**: Select gender, age group, and foreign worker status as sensitive attributes.
3. **Model Construction**: Use the OptBinning library to construct a machine learning model, including variable binning and scorecard generation.
4. **Fairness Analysis**: Use the BRIO tool for bias detection and risk assessment, comparing the model's behaviour across the categories of each sensitive attribute and calculating various bias metrics.

### Main Findings

1. **Gender Differences**: Males fare better than females in terms of default risk.
2. **Age Differences**: Older groups fare better than younger groups in terms of default risk.
3. **Nationality Differences**: Native workers fare better than foreign workers in terms of default risk.

### Conclusion

Through the analysis with the BRIO tool, the authors were able to identify potential sources of bias and discrimination in the credit scoring model and propose some mitigation strategies. These findings are significant for ensuring the fairness of credit scoring systems and promoting financial inclusion.
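The fairness analysis step listed under Research Methods compares a model's behaviour across the categories of a sensitive attribute. The sketch below illustrates that general idea with a simple approval-rate comparison between groups. It is not BRIO's API, and it does not reproduce BRIO's metrics: the function names, the toy data, and the `THRESHOLD` value are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def approval_rates(groups, approved):
    """Return the share of favourable decisions (1) for each group."""
    totals = Counter(groups)
    positives = Counter(g for g, a in zip(groups, approved) if a)
    return {g: positives[g] / totals[g] for g in totals}


def max_rate_gap(rates):
    """Largest difference in approval rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Toy data (hypothetical, not taken from the German Credit Dataset):
# one sensitive attribute value and one model decision per applicant.
groups = ["male", "male", "male", "male",
          "female", "female", "female", "female"]
approved = [1, 1, 1, 0, 1, 0, 0, 1]

rates = approval_rates(groups, approved)
gap = max_rate_gap(rates)

# Hypothetical tolerance: gaps above it are flagged for closer review.
THRESHOLD = 0.2
flagged = gap > THRESHOLD

print(rates, gap, flagged)
```

A gap above the chosen tolerance does not by itself establish discrimination; it only marks a sensitive attribute as worth a closer look, which is the role a bias detection step plays before any risk assessment.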

Evaluating AI fairness in credit scoring with the BRIO tool

Standardizing fairness-evaluation procedures: interdisciplinary insights on machine learning algorithms in creditworthiness assessments for small personal loans

Responsible AI in automated credit scoring systems

Towards Responsible AI: A Design Space Exploration of Human-Centered Artificial Intelligence User Interfaces to Investigate Fairness

Towards Responsible AI in Banking: Addressing Bias for Fair Decision-Making

Fairness in Credit Scoring: Assessment, Implementation and Profit Implications

A Distributionally Robust Optimisation Approach to Fair Credit Scoring

Fairness Assessment for Artificial Intelligence in Financial Industry

Identifying, measuring, and mitigating individual unfairness for supervised learning models and application to credit risk models

Algorithmic fairness in credit scoring

Human-in-the-loop Fairness: Integrating Stakeholder Feedback to Incorporate Fairness Perspectives in Responsible AI

The Fairness of Credit Scoring Models

How fair is machine learning in credit lending?

Best Practices for Responsible Machine Learning in Credit Scoring

SAFE Artificial Intelligence in Finance

EARN Fairness: Explaining, Asking, Reviewing and Negotiating Artificial Intelligence Fairness Metrics Among Stakeholders

Facing the Challenges of Developing Fair Risk Scoring Models

Towards Involving End-users in Interactive Human-in-the-loop AI Fairness

Preliminary Insights on Industry Practices for Addressing Fairness Debt

Baseline validation of a bias-mitigated loan screening model based on the European Banking Authority's trust elements of Big Data & Advanced Analytics applications using Artificial Intelligence

Influence of Artificial Intelligence on Credit Risk Assessment in Banking Sector