Assessing Phenotype Definitions for Algorithmic Fairness

Tony Y. Sun, Shreyas Bhave, Jaan Altosaar, Noémie Elhadad

DOI: https://doi.org/10.48550/arXiv.2203.05174

2022-08-28

Abstract: Disease identification is a core, routine activity in observational health research. Cohorts impact downstream analyses, such as how a condition is characterized, how patient risk is defined, and what treatments are studied. It is thus critical to ensure that selected cohorts are representative of all patients, independently of their demographics or social determinants of health. While there are multiple potential sources of bias when constructing phenotype definitions which may affect their fairness, it is not standard in the field of phenotyping to consider the impact of different definitions across subgroups of patients. In this paper, we propose a set of best practices to assess the fairness of phenotype definitions. We leverage established fairness metrics commonly used in predictive models and relate them to commonly used epidemiological cohort description metrics. We describe an empirical study for Crohn's disease and diabetes type 2, each with multiple phenotype definitions taken from the literature across two sets of patient subgroups (gender and race). We show that the different phenotype definitions exhibit widely varying and disparate performance according to the different fairness metrics and subgroups. We hope that the proposed best practices can help in constructing fair and inclusive phenotype definitions.

Other Quantitative Biology, Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

This paper addresses potential unfairness in the phenotype definitions used to identify patients with a disease in medical research. It asks how to evaluate and construct phenotype definitions that perform consistently across population subgroups (such as gender and race), so that a poorly chosen definition does not bias research results or exacerbate health inequalities. The authors point out that in observational health studies the choice of phenotype definition directly affects downstream analyses, such as disease characterization, the definition of patient risk, and the study of treatments. Selected cohorts should therefore represent all patients, independently of their demographics or social determinants of health. Existing phenotype definitions, however, may be affected by multiple sources of bias, such as bias in diagnosis, in treatment, and in access to medical services, which can cause certain groups to be wrongly excluded or over-represented.

To address this, the paper proposes a set of best practices that assess the fairness of phenotype definitions by relating fairness metrics commonly used for predictive models to the cohort description metrics commonly used in epidemiology. These practices help identify unfairness in existing phenotype definitions and give researchers a framework for considering different populations more transparently and deliberately when designing new definitions, with the aim of fairer and more inclusive health research.
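The kind of subgroup comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the summary does not say which metrics the paper reports, so the three used here (sensitivity, positive predictive value, and selection rate, which correspond to the fairness notions of equal opportunity, predictive parity, and demographic parity) are assumptions. The record layout, the function names `subgroup_rates` and `max_gap`, and the toy data are hypothetical, and the sketch assumes a reference standard for true disease status (for example, chart review) is available.

```python
from collections import defaultdict


def subgroup_rates(records):
    """Compute per-subgroup cohort metrics for one phenotype definition.

    `records` is an iterable of (subgroup, has_disease, in_cohort) tuples,
    where `has_disease` is the reference-standard label (an assumption of
    this sketch) and `in_cohort` is whether the definition selected the patient.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for subgroup, has_disease, in_cohort in records:
        if has_disease and in_cohort:
            key = "tp"
        elif in_cohort:
            key = "fp"
        elif has_disease:
            key = "fn"
        else:
            key = "tn"
        counts[subgroup][key] += 1

    rates = {}
    for subgroup, c in counts.items():
        diseased = c["tp"] + c["fn"]
        selected = c["tp"] + c["fp"]
        total = sum(c.values())
        rates[subgroup] = {
            # Equal sensitivity across groups corresponds to equal opportunity.
            "sensitivity": c["tp"] / diseased if diseased else None,
            # Equal PPV across groups corresponds to predictive parity.
            "ppv": c["tp"] / selected if selected else None,
            # Equal selection rate across groups corresponds to demographic parity.
            "selection_rate": selected / total,
        }
    return rates


def max_gap(rates, metric):
    """Largest difference in `metric` between any two subgroups."""
    values = [r[metric] for r in rates.values() if r[metric] is not None]
    return max(values) - min(values) if values else None


# Hypothetical toy data: (subgroup, has_disease, in_cohort).
toy_records = (
    [("A", True, True)] * 3 + [("A", True, False)] * 1
    + [("A", False, True)] * 1 + [("A", False, False)] * 5
    + [("B", True, True)] * 1 + [("B", True, False)] * 3
    + [("B", False, False)] * 6
)

rates = subgroup_rates(toy_records)
for group in sorted(rates):
    print(group, rates[group])
print("sensitivity gap:", max_gap(rates, "sensitivity"))
```

In this toy data the definition captures three of the four diseased patients in subgroup A but only one of four in subgroup B, so the sensitivity gap is large even though PPV looks better for B. A gap in one metric alongside an apparent advantage in another illustrates why a definition might be compared on several metrics rather than one.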

How Do Fairness Definitions Fare? Examining Public Attitudes Towards Algorithmic Definitions of Fairness

A general framework for developing computable clinical phenotype algorithms

Algorithmic fairness in artificial intelligence for medicine and healthcare

Assessing Algorithmic Fairness with Unobserved Protected Class Using Data Combination

Toward Cross-Platform Electronic Health Record-Driven Phenotyping Using Clinical Quality Language

What is Fair? Defining Fairness in Machine Learning for Health

Desiderata for Computable Representations of Electronic Health Records-Driven Phenotype Algorithms

Algorithm Fairness in AI for Medicine and Healthcare

Racial Bias in Clinical and Population Health Algorithms: A Critical Review of Current Debates

Developing a Robust Computable Phenotype Definition Workflow to Describe Health and Disease in Observational Health Research

Impact of Diverse Data Sources on Computational Phenotyping

Fair Machine Learning for Healthcare Requires Recognizing the Intersectionality of Sociodemographic Factors, a Case Study

Talking about diseases; developing a model of patient and public-prioritised disease phenotypes

An Empirical Characterization of Fair Machine Learning For Clinical Risk Prediction

Achieve Fairness without Demographics for Dermatological Disease Diagnosis

Reporting of Fairness Metrics in Clinical Risk Prediction Models: A Call for Change

Conceptualising fairness: three pillars for medical algorithms and health equity

Evaluating the impact of alternative phenotype definitions on incidence rates across a global data network

Designing Equitable Algorithms

Human Perceptions of Fairness in Algorithmic Decision Making: A Case Study of Criminal Risk Prediction