Assessing Phenotype Definitions for Algorithmic Fairness

Tony Y. Sun, Shreyas Bhave, Jaan Altosaar, Noémie Elhadad
DOI: https://doi.org/10.48550/arXiv.2203.05174
2022-08-28
Abstract: Disease identification is a core, routine activity in observational health research. Cohorts impact downstream analyses, such as how a condition is characterized, how patient risk is defined, and what treatments are studied. It is thus critical to ensure that selected cohorts are representative of all patients, independently of their demographics or social determinants of health. While there are multiple potential sources of bias when constructing phenotype definitions which may affect their fairness, it is not standard in the field of phenotyping to consider the impact of different definitions across subgroups of patients. In this paper, we propose a set of best practices to assess the fairness of phenotype definitions. We leverage established fairness metrics commonly used in predictive models and relate them to commonly used epidemiological cohort description metrics. We describe an empirical study for Crohn's disease and diabetes type 2, each with multiple phenotype definitions taken from the literature across two sets of patient subgroups (gender and race). We show that the different phenotype definitions exhibit widely varying and disparate performance according to the different fairness metrics and subgroups. We hope that the proposed best practices can help in constructing fair and inclusive phenotype definitions.
Other Quantitative Biology, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses potential unfairness in how disease phenotypes are defined in medical research. Specifically, it focuses on how to evaluate and construct phenotype definitions that perform consistently across population subgroups (such as gender and race), so that poorly chosen definitions do not bias research results or exacerbate health inequalities. The authors point out that in observational health studies, the choice of phenotype definition directly affects downstream analyses, such as how a disease is characterized, how patient risk is defined, and which treatments are studied. It is therefore important that the selected cohorts represent all patients, regardless of their demographics or social determinants of health. However, existing phenotype definitions may be subject to multiple potential sources of bias, such as diagnostic bias, treatment bias, and bias in access to medical services, which can cause certain groups to be wrongly excluded or over-represented. To address this, the paper proposes a set of best practices for assessing the fairness of phenotype definitions by combining evaluation metrics from epidemiology and algorithmic fairness. These practices help identify unfairness in existing phenotype definitions and give researchers a framework for considering the needs and characteristics of different populations more transparently and deliberately when designing new definitions, thereby promoting fairer and more inclusive health research.
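As an illustration of what a subgroup-level assessment of a phenotype definition might look like, here is a minimal sketch in Python. It is not the authors' implementation. The summary above does not name the specific metrics, so the sketch assumes three quantities that are standard in both epidemiology and algorithmic fairness: sensitivity, positive predictive value (PPV), and selection rate. The function names (`subgroup_metrics`, `max_gap`), the record layout, and the toy data are all hypothetical.

```python
from collections import defaultdict


def subgroup_metrics(records):
    """Compute per-subgroup metrics for one phenotype definition.

    records: iterable of (subgroup, has_condition, selected) tuples, where
    has_condition is a reference-standard label and selected says whether
    the phenotype definition placed the patient in the cohort.
    (This layout is an assumption made for the sketch.)
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for group, has_condition, selected in records:
        if has_condition and selected:
            key = "tp"
        elif selected:
            key = "fp"
        elif has_condition:
            key = "fn"
        else:
            key = "tn"
        counts[group][key] += 1

    metrics = {}
    for group, c in counts.items():
        total = sum(c.values())
        positives = c["tp"] + c["fn"]   # patients who truly have the condition
        selected = c["tp"] + c["fp"]    # patients the definition selects
        metrics[group] = {
            # Sensitivity (recall): compared across groups for "equal opportunity".
            "sensitivity": c["tp"] / positives if positives else None,
            # PPV (precision): compared across groups for predictive parity.
            "ppv": c["tp"] / selected if selected else None,
            # Selection rate: compared across groups for demographic parity.
            "selection_rate": selected / total,
        }
    return metrics


def max_gap(metrics, name):
    """Largest between-group difference for one metric (0 means parity)."""
    values = [m[name] for m in metrics.values() if m[name] is not None]
    return max(values) - min(values)


# Toy data, invented purely for illustration.
toy_records = [
    ("A", True, True), ("A", True, True), ("A", False, False), ("A", False, True),
    ("B", True, True), ("B", True, False), ("B", False, False), ("B", False, False),
]

metrics = subgroup_metrics(toy_records)
for group, values in sorted(metrics.items()):
    print(group, values)
print("sensitivity gap:", max_gap(metrics, "sensitivity"))
```

On this toy data the definition captures every true case in group A but only half of those in group B, which is the kind of disparity such an assessment is meant to surface. A real study would need a reference standard for `has_condition` and would repeat the comparison for each candidate phenotype definition.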