Salvador Robles Herrera, Verya Monjezi, Vladik Kreinovich, Ashutosh Trivedi, Saeid Tizpaz-Niari
Abstract: This paper investigates the relationships between hyperparameters of machine learning and fairness. Data-driven solutions are increasingly used in critical socio-technical applications where ensuring fairness is important. Rather than explicitly encoding decision logic via control and data structures, ML developers provide input data, perform some pre-processing, choose ML algorithms, and tune hyperparameters (HPs) to infer a program that encodes the decision logic. Prior works report that the selection of HPs can significantly influence fairness. However, tuning HPs to find an ideal trade-off between accuracy, precision, and fairness has remained an expensive and tedious task. Can we predict the fairness of an HP configuration for a given dataset? Are the predictions robust to distribution shifts?
We focus on group fairness notions and investigate the HP space of 5 training algorithms. We first find that tree regressors and XGBoost significantly outperformed deep neural networks and support vector machines in accurately predicting the fairness of HPs. When predicting the fairness of ML hyperparameters under temporal distribution shift, the tree regressors outperform the other algorithms with reasonable accuracy. However, the precision depends on the ML training algorithm, dataset, and protected attributes. For example, the tree regressor model was robust to a training data shift from 2014 to 2018 on logistic regression and discriminant analysis HPs with sex as the protected attribute, but not for race or for the other training algorithms. Our method provides a sound framework to efficiently perform fine-tuning of ML training algorithms and understand the relationships between HPs and fairness.
Software Engineering, Artificial Intelligence, Computers and Society, Machine Learning
What problem does this paper attempt to address?
The core question of the paper is: **Can the fairness of machine learning (ML) hyperparameter configurations be predicted, and does the prediction remain robust when the data distribution shifts?** Specifically, the paper asks:
1. **Can the group fairness of ML hyperparameters be accurately predicted given the training data set and protected attributes?**
2. **How do different types of prediction methods (such as neural networks, support vector regression, tree regressors, and XGBoost) perform in predicting the fairness of ML hyperparameters?**
3. **Can the fairness of ML hyperparameters be predicted under temporal distribution shift? Which types of ML algorithms are more robust under such shifts?**
### Detailed Interpretation
#### Problem Background
With the widespread use of data-driven solutions in critical socio-technical applications, ensuring the fairness of these systems has become crucial. Machine learning developers build models by providing input data, selecting algorithms, and adjusting hyperparameters. However, the selection of hyperparameters has a significant impact on the fairness of the model, and finding the ideal combination of hyperparameters to balance accuracy, precision, and fairness is an expensive and cumbersome task.
#### Research Objectives
The goals of the paper are:
- To explore the relationship between machine learning hyperparameters and fairness.
- To use regression methods to learn this relationship, so that the fairness of a specific hyperparameter configuration can be predicted without running a complete training cycle.
- To analyze the robustness of these prediction models under changes in data distribution (especially temporal distribution shift).
#### Experimental Design
To answer the above questions, the authors conducted the following experiments:
- **Data Sets**: Four socially critical data sets (Adult Census, COMPAS Recidivism, Default Credit, Bank Marketing) were used, covering different protected attributes (such as sex and race).
- **ML Algorithms**: Five popular machine learning algorithms (decision tree classifier, support vector machine, logistic regression classifier, random forest, and discriminant analysis) were selected.
- **Prediction Methods**: Four regression methods (deep neural network, support vector regression, tree regressor, and XGBoost) were used to learn the mapping from hyperparameters to fairness.
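To make the idea of learning a mapping from hyperparameters to fairness concrete, here is a minimal sketch of a depth-one regression tree (a "stump") over a single numeric hyperparameter. It is not the authors' implementation: the function names `fit_stump` and `predict_stump` are hypothetical, and the (hyperparameter, AOD) pairs are invented purely for illustration rather than taken from the paper.

```python
from statistics import mean


def fit_stump(xs, ys):
    """Fit a depth-1 regression tree on one numeric feature.

    Returns (threshold, left_mean, right_mean) for the split that
    minimises the sum of squared errors.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no valid split between equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def predict_stump(stump, x):
    """Predict the fairness score for hyperparameter value x."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


# Hypothetical (hyperparameter value, measured AOD) pairs, invented for
# illustration; in practice each AOD comes from training a model once.
hp_values = [0.01, 0.1, 1.0, 10.0, 100.0]
measured_aod = [0.02, 0.02, 0.04, 0.10, 0.10]

stump = fit_stump(hp_values, measured_aod)
predicted = predict_stump(stump, 50.0)  # no retraining needed for this query
```

Once fitted, the regressor answers fairness queries for unseen configurations without retraining the underlying classifier, which is the cost saving the summary describes. A practical setup would use a full tree regressor over all hyperparameters rather than a single split.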
#### Main Findings
- **Performance on Fixed Data Sets**: Tree regressors and XGBoost performed well in predicting the AOD fairness of all five algorithms, achieving \( R^2\geq0.95 \) in 40% of cases and falling to \( R^2\leq0.5 \) in only 6.7% of cases.
- **Performance under Temporal Distribution Shift**: Tree regressors and XGBoost performed relatively well under the one-year temporal distribution shift, but the accuracy decreased significantly for other training algorithms and protected attributes (such as race).
#### Conclusions
The paper provides a systematic framework for efficiently tuning ML training algorithms and understanding the relationship between hyperparameters and fairness. Although prediction remains challenging in some cases, the study offers valuable insights for avoiding biased configurations in data-driven software development and points out directions for future research.
### Formula Summary
The formulas involved in the paper include:
- **Average Odds Difference (AOD)**:
\[
AOD_M=\frac{|TPR_M(0)-TPR_M(1)|+|FPR_M(0)-FPR_M(1)|}{2}
\]
- **Mean Squared Error (MSE)**:
\[
MSE = \frac{1}{n}\sum_{i = 1}^{n}(AOD_i-\hat{AOD}_i)^2
\]
- **Coefficient of Determination (\( R^2 \))**:
\[
R^2=1-\frac{\sum_{i = 1}^{n}(AOD_i-\hat{AOD}_i)^2}{\sum_{i = 1}^{n}(AOD_i-\bar{AOD})^2}
\]
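AOD measures the group fairness of a trained model \( M \): \( TPR_M(a) \) and \( FPR_M(a) \) are its true and false positive rates on the group with protected-attribute value \( a \). MSE and \( R^2 \) evaluate the prediction models: they compare the measured \( AOD_i \) of the \( i \)-th hyperparameter configuration with the predicted \( \hat{AOD}_i \), where \( \bar{AOD} \) is the mean of the measured values.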
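The three formulas above can be sketched in plain Python as follows. This is an illustrative sketch, not code from the paper: the function names are hypothetical, labels and predictions are assumed to be binary (0/1), the protected attribute is assumed to take the values 0 and 1, and the sample inputs are invented.

```python
def positive_rates(y_true, y_pred):
    """Return (TPR, FPR) for binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("each group needs both positive and negative labels")
    return tp / (tp + fn), fp / (fp + tn)


def average_odds_difference(y_true, y_pred, group):
    """AOD = (|TPR(0) - TPR(1)| + |FPR(0) - FPR(1)|) / 2."""
    split = {0: ([], []), 1: ([], [])}
    for t, p, g in zip(y_true, y_pred, group):
        split[g][0].append(t)
        split[g][1].append(p)
    tpr0, fpr0 = positive_rates(*split[0])
    tpr1, fpr1 = positive_rates(*split[1])
    return (abs(tpr0 - tpr1) + abs(fpr0 - fpr1)) / 2


def mse(actual, predicted):
    """Mean squared error between measured and predicted AOD values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    avg = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are equal")
    return 1 - ss_res / ss_tot


# Invented example: group 0 has TPR 0.5 and FPR 0.0, group 1 has TPR 1.0 and FPR 0.5.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
aod_example = average_odds_difference(y_true, y_pred, group)

# Invented measured vs. predicted AOD values for four configurations.
measured = [0.02, 0.04, 0.06, 0.08]
predicted_aod = [0.03, 0.04, 0.05, 0.08]
mse_example = mse(measured, predicted_aod)
r2_example = r_squared(measured, predicted_aod)
```

In the invented example both the TPR gap and the FPR gap are 0.5, so the AOD is 0.5; an AOD of 0 would mean both rates match across the two groups.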