Abstract: Deep Learning (DL) has made significant strides in various medical applications in recent years, achieving remarkable results. In the field of medical imaging, DL models can assist doctors in disease diagnosis by classifying pathologies in chest X-ray images. However, training on new data to expand model capabilities and adapt to distribution shifts is a notable challenge these models face. Continual Learning (CL) has emerged as a solution to this challenge, enabling models to adapt to new data while retaining knowledge gained from previous experiences. Previous studies have analyzed the behavior of CL strategies in medical imaging in terms of classification performance. However, when considering models that interact with sensitive information, such as in the medical domain, it is imperative to disaggregate performance across socially salient groups. Indeed, DL algorithms can exhibit biases against certain sub-populations, leading to discrepancies in predictive performance across different groups identified by sensitive attributes such as age, race/ethnicity, sex/gender, and socioeconomic status. In this study, we go beyond the typical assessment of classification performance in CL and study bias evolution over successive tasks with domain-specific fairness metrics. Specifically, we evaluate the CL strategies using the well-known CheXpert (CXP) and ChestX-ray14 (NIH) datasets. We consider a class incremental scenario of five tasks with 12 pathologies. We evaluate the Replay, Learning without Forgetting (LwF), LwF Replay, and Pseudo-Label strategies. LwF and Pseudo-Label achieve the best classification performance, but when fairness metrics are included in the evaluation, it is clear that Pseudo-Label is less biased. For this reason, this strategy should be preferred in real-world scenarios where model fairness is crucial.

Fairness-enhancing mixed effects deep learning improves fairness on in- and out-of-distribution clustered (non-iid) data

Fair Distillation: Teaching Fairness from Biased Teachers in Medical Imaging

Fairness Evolution in Continual Learning for Medical Imaging

FairIF: Boosting Fairness in Deep Learning via Influence Functions with Validation Set Sensitive Attributes

FairFix: Enhancing Fairness of Pre-Trained Deep Neural Networks with Scarce Data Resources

Is Fairness Only Metric Deep? Evaluating and Addressing Subgroup Gaps in Deep Metric Learning

Improving Equity in Deep Learning Medical Applications with the Gerchberg-Saxton Algorithm

AdaFair: Cumulative Fairness Adaptive Boosting

Improving Fairness in AI Models on Electronic Health Records: The Case for Federated Learning Methods

FairFML: Fair Federated Machine Learning with a Case Study on Reducing Gender Disparities in Cardiac Arrest Outcome Prediction

FaiR-N: Fair and Robust Neural Networks for Structured Data

FairDR: Ensuring Fairness in Mixed Data of Fairly and Unfairly Treated Instances

Algorithmic fairness and bias mitigation for clinical machine learning with deep reinforcement learning

Federated Learning Meets Fairness and Differential Privacy

Improving the Fairness of Deep-Learning, Short-term Crime Prediction with Under-reporting-aware Models

Fairness in Cardiac MR Image Analysis: An Investigation of Bias Due to Data Imbalance in Deep Learning Based Segmentation

Deep Fair Discriminative Clustering

Towards A Holistic View of Bias in Machine Learning: Bridging Algorithmic Fairness and Imbalanced Learning

Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data

FairDisCo: Fairer AI in Dermatology via Disentanglement Contrastive Learning

Fairness And Performance In Harmony: Data Debiasing Is All You Need