Abstract: Fairness in both machine learning (ML) predictions and human decisions is critical, with ML models prone to algorithmic and data bias, and human decisions affected by subjectivity and cognitive bias. This study investigates fairness using a real-world university admission dataset with 870 profiles, leveraging three ML models, namely XGB, Bi-LSTM, and KNN. Textual features are encoded with BERT embeddings. For individual fairness, we assess decision consistency among experts with varied backgrounds and ML models, using a consistency score. Results show ML models outperform humans in fairness by 14.08% to 18.79%. For group fairness, we propose a gender-debiasing pipeline and demonstrate its efficacy in removing gender-specific language without compromising prediction performance. Post-debiasing, all models maintain or improve their classification accuracy, validating the hypothesis that fairness and performance can coexist. Our findings highlight ML's potential to enhance fairness in admissions while maintaining high accuracy, advocating a hybrid approach combining human judgement and ML models.
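
The abstract does not spell out how the consistency score is computed. As a minimal sketch, assuming the widely used k-nearest-neighbour formulation (one minus the average gap between each applicant's decision and the mean decision of their k most similar applicants), it could look like the following. The function name, the Euclidean distance, and the toy data are illustrative assumptions, not the paper's implementation.

```python
import math


def consistency_score(features, decisions, k=3):
    """Assumed k-NN consistency: 1 - mean_i |y_i - mean(y_j for j in kNN(i))|.

    features  : list of equal-length numeric tuples (e.g. profile embeddings)
    decisions : list of 0/1 decisions, one per profile
    k         : number of nearest neighbours compared against each profile
    """
    n = len(features)
    if n <= k:
        raise ValueError("need more than k samples")
    total_gap = 0.0
    for i in range(n):
        # Rank all other profiles by Euclidean distance to profile i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(features[i], features[j]),
        )
        neighbours = others[:k]
        neighbour_mean = sum(decisions[j] for j in neighbours) / k
        total_gap += abs(decisions[i] - neighbour_mean)
    return 1.0 - total_gap / n


# Toy example: two well-separated clusters of three profiles each.
profiles = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
consistent = consistency_score(profiles, [1, 1, 1, 0, 0, 0], k=2)
inconsistent = consistency_score(profiles, [1, 0, 1, 0, 0, 0], k=2)
```

Under this definition a score of 1.0 means every applicant received the same decision as their nearest neighbours, and lower values indicate that similar profiles were treated differently.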

What problem does this paper attempt to address?

The main problem this paper addresses is fairness in machine learning (ML) models and in human decision-making. Specifically:

1. **Individual Fairness**: The paper evaluates how consistently human experts from different backgrounds and ML models make university admission decisions. A consistency score quantifies whether applicants with similar backgrounds are treated similarly. ML models outperform human experts on individual fairness, with consistency scores 14.08% to 18.79% higher.
2. **Group Fairness**: The paper examines how to remove gender bias through data debiasing while maintaining or improving prediction performance. It proposes a gender-debiasing pipeline and verifies its effectiveness experimentally. The debiased models maintain or improve their classification accuracy, supporting the hypothesis that fairness and performance can coexist.

### Main Contributions

1. **Propose and Verify an Effective Debiasing Pipeline**: The paper proposes a pipeline for identifying and mitigating biases in machine learning models and verifies its effectiveness with statistical methods.
2. **Show Experimentally that Debiasing Does Not Sacrifice Accuracy**: Data debiasing does not reduce model performance and may even improve accuracy.
3. **Empirically Study Individual and Group Fairness**: The paper conducts an empirical study in a real-world university admission scenario, finding that ML models are more consistent than human experts and perform well on both fairness and accuracy.

### Research Methods

- **Dataset**: A real-world university admission dataset containing 870 unique applicant profiles.
- **Models**: Three ML models: XGBoost, Bi-LSTM, and KNN.
- **Feature Encoding**: BERT encodes text features as high-dimensional embedding vectors.
- **Evaluation Metrics**: In addition to standard classification metrics (precision, recall, F1-score, and accuracy), a consistency score is used to evaluate individual fairness.

### Experimental Results

- **Individual Fairness**: ML models outperform human experts on individual fairness, with consistency scores 14.08% to 18.79% higher.
- **Group Fairness**: The debiased models maintain or improve their classification accuracy, supporting the hypothesis that fairness and performance can coexist.

### Conclusion

The results show that ML models have clear advantages in ensuring both individual and group fairness. These findings are relevant to decision-support tools in the university admission process, and the paper recommends a hybrid approach that combines human judgment with ML models to optimize fairness and performance.
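
The summary says the gender-debiasing pipeline removes gender-specific language from the text, but it does not describe the individual steps. One plausible step, shown here purely as a sketch, is to swap gendered words for neutral ones before the text is embedded. The term list `NEUTRAL_TERMS` and the function `neutralise_gender_terms` are hypothetical names introduced for illustration; the paper's actual vocabulary and method may differ.

```python
import re

# Hypothetical, deliberately small mapping; a real pipeline would need a
# much larger vocabulary (names, titles, kinship terms, and so on).
NEUTRAL_TERMS = {
    "he": "they",
    "she": "they",
    "him": "them",
    "his": "their",
    "her": "their",  # crude: "her" can also be an object pronoun ("them")
    "himself": "themself",
    "herself": "themself",
    "mr": "mx",
    "mrs": "mx",
    "ms": "mx",
}

# Whole-word, case-insensitive match on any listed term.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, NEUTRAL_TERMS)) + r")\b",
    re.IGNORECASE,
)


def neutralise_gender_terms(text):
    """Replace listed gendered words with neutral ones, keeping capitalisation."""

    def _swap(match):
        word = match.group(0)
        replacement = NEUTRAL_TERMS[word.lower()]
        return replacement.capitalize() if word[0].isupper() else replacement

    return _PATTERN.sub(_swap, text)


cleaned = neutralise_gender_terms("She said her brother thanked him.")
```

Word-boundary matching keeps substrings such as the "her" in "brother" untouched. A rule-based swap like this does not fix verb agreement or catch implicit gender cues, so it would be only one part of a fuller pipeline.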

Fairness And Performance In Harmony: Data Debiasing Is All You Need

Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data

Towards A Holistic View of Bias in Machine Learning: Bridging Algorithmic Fairness and Imbalanced Learning

On the Fairness of Machine-Assisted Human Decisions

Does Machine Bring in Extra Bias in Learning? Approximating Fairness in Models Promptly

A Comprehensive Empirical Study of Bias Mitigation Methods for Machine Learning Classifiers

An Empirical Comparison of Bias Reduction Methods on Real-World Problems in High-Stakes Policy Settings

Fairness for Deep Learning Predictions Using Bias Parity Score Based Loss Function Regularization

Normalise for Fairness: A Simple Normalisation Technique for Fairness in Regression Machine Learning Problems

AdapFair: Ensuring Continuous Fairness for Machine Learning Operations

Towards Fair Machine Learning Software: Understanding and Addressing Model Bias Through Counterfactual Thinking

Bias in Machine Learning Software: Why? How? What to do?

Aleatoric and Epistemic Discrimination: Fundamental Limits of Fairness Interventions

Editable Fairness: Fine-Grained Bias Mitigation in Language Models

Data vs. Model Machine Learning Fairness Testing: An Empirical Study

The Equity Framework: Fairness Beyond Equalized Predictive Outcomes

AIM: Attributing, Interpreting, Mitigating Data Unfairness

Fix Fairness, Don't Ruin Accuracy: Performance Aware Fairness Repair using AutoML

Fairness Measures of Machine Learning Models in Judicial Penalty Prediction

Does Debiasing Inevitably Degrade the Model Performance

Increasing Fairness in Predictions Using Bias Parity Score Based Loss Function Regularization