FairMOE: counterfactually-fair mixture of experts with levels of interpretability

Joe Germino, Nuno Moniz, Nitesh V. Chawla
DOI: https://doi.org/10.1007/s10994-024-06583-2
IF: 5.414
2024-07-10
Machine Learning
Abstract: With the rise of artificial intelligence in our everyday lives, the need for human interpretation of machine learning models' predictions emerges as a critical issue. Generally, interpretability is viewed as a binary notion with a performance trade-off. Either a model is fully interpretable but lacks the ability to capture more complex patterns in the data, or it is a black box. In this paper, we argue that this view is severely limiting and that instead interpretability should be viewed as a continuous domain-informed concept. We leverage the well-known Mixture of Experts architecture with user-defined limits on non-interpretability. We extend this idea with a counterfactual fairness module to ensure the selection of consistently fair experts: FairMOE. We perform an extensive experimental evaluation with fairness-related data sets and compare our proposal against state-of-the-art methods. Our results demonstrate that FairMOE is competitive with the leading fairness-aware algorithms in both fairness and predictive measures while providing more consistent performance, competitive scalability, and, most importantly, greater interpretability.
computer science, artificial intelligence