Abstract: Cause-to-effect analysis can help us decompose all the likely causes of a problem, such as an undesirable business situation or unintended harm to individuals. With it, we can identify how problems are inherited, rank the causes to help prioritize fixes, and simplify and visualize a complex problem. In the context of machine learning (ML), one can use cause-to-effect analysis to understand the reasons for a system's biased behavior. For example, we can examine the root causes of bias by checking each feature as a potential cause of bias in the model. To do this, one can apply small changes to a given feature or pair of features in the data, following some guidelines, and observe how the changes affect the decision made by the model (i.e., the model prediction). We can therefore use cause-to-effect analysis to identify potential bias-inducing features even when these features are originally unknown. This is important because most current methods require sensitive features to be identified in advance for bias assessment and can miss other relevant bias-inducing features, which is why systematic identification of such features is necessary. Moreover, achieving an equitable outcome often requires taking sensitive features into account in the model decision. It should therefore be up to domain experts to decide, based on their knowledge of the context of a decision, whether the bias induced by specific features is acceptable. In this study, we propose an approach for systematically identifying all bias-inducing features of a model to help support the decision-making of domain experts. We evaluated our technique on four well-known datasets to showcase how our contribution can help spearhead a standard procedure for developing, testing, maintaining, and deploying fair/equitable machine learning systems.
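The perturbation idea described in the abstract (apply a small change to one feature and observe whether the model prediction changes) can be illustrated with a minimal sketch. This is not the paper's implementation: the toy model `toy_predict`, the feature layout (income, group, noise), the sample rows, and the step sizes in `deltas` are all hypothetical and chosen only for illustration. The sketch also omits the perturbation guidelines mentioned in the abstract, so it may push a feature outside its valid range.

```python
from typing import Callable, List, Sequence, Tuple

Row = Sequence[float]


def flip_rate(predict: Callable[[Row], int], rows: List[Row],
              feature: int, delta: float) -> float:
    """Fraction of rows whose prediction changes when `feature`
    is shifted by +delta or -delta (all other features held fixed)."""
    flips = 0
    for row in rows:
        base = predict(row)
        for signed in (delta, -delta):
            perturbed = list(row)
            perturbed[feature] += signed
            if predict(perturbed) != base:
                flips += 1
                break  # count each row at most once
    return flips / len(rows)


def rank_features(predict: Callable[[Row], int], rows: List[Row],
                  deltas: Sequence[float]) -> List[Tuple[int, float]]:
    """Rank feature indices by flip rate, most influential first."""
    rates = [(j, flip_rate(predict, rows, j, d)) for j, d in enumerate(deltas)]
    return sorted(rates, key=lambda item: item[1], reverse=True)


# Hypothetical toy model (an assumption, not from the paper):
# the decision depends on income and group, and ignores noise.
def toy_predict(row: Row) -> int:
    income, group, _noise = row
    return int(0.5 * income - 2.0 * group > 1.0)


# Hypothetical data: (income, group, noise) per row.
rows = [(2.1, 0, 5.0), (6.2, 1, 1.0), (3.0, 0, 2.0),
        (5.8, 1, 3.0), (9.0, 1, 0.0), (1.0, 0, 4.0)]
deltas = [0.5, 1.0, 0.5]  # assumed per-feature perturbation sizes

ranking = rank_features(toy_predict, rows, deltas)
for index, rate in ranking:
    print(f"feature {index}: flip rate {rate:.2f}")
```

In this toy setting, the feature with the highest flip rate is the strongest candidate for a bias-inducing feature. Whether the induced bias is acceptable is, as the abstract argues, a decision for domain experts.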

A survey on measuring indirect discrimination in machine learning

Fairness-aware machine learning: a perspective

A causal framework for discovering and removing direct and indirect discrimination

Conscientious Classification: A Data Scientist's Guide to Discrimination-Aware Classification

Shedding light on underrepresentation and Sampling Bias in machine learning

A Survey on Bias and Fairness in Machine Learning

Managing bias and unfairness in data for decision support: a survey of machine learning and data engineering approaches to identify and mitigate bias and unfairness within data management and analytics systems

Why Is My Classifier Discriminatory?

Modeling Techniques for Machine Learning Fairness: A Survey

Awareness in Practice: Tensions in Access to Sensitive Attribute Data for Antidiscrimination

Understanding Bias in Machine Learning

Aleatoric and Epistemic Discrimination: Fundamental Limits of Fairness Interventions

Detection and Evaluation of bias-inducing Features in Machine learning

Exposing Algorithmic Discrimination and Its Consequences in Modern Society: Insights from a Scoping Study

Multi-dimensional discrimination in Law and Machine Learning -- A comparative overview

Metrics and methods for a systematic comparison of fairness-aware machine learning algorithms

A Comprehensive Empirical Study of Bias Mitigation Methods for Machine Learning Classifiers

A Discussion of Discrimination and Fairness in Insurance Pricing

Bias Mitigation for Machine Learning Classifiers: A Comprehensive Survey

Do the Machine Learning Models on a Crowd Sourced Platform Exhibit Bias? An Empirical Study on Model Fairness