Abstract: Understanding causality is crucial in the biomedical sciences, where prediction models alone are insufficient because the models need to be actionable. However, data sources such as electronic health records are observational and often affected by various types of bias, e.g. confounding. Although randomized controlled trials are the gold standard for estimating the causal effects of treatment interventions on health outcomes, they are not always possible. Propensity score matching (PSM) is a popular statistical technique for observational data that aims to balance the characteristics of the populations assigned to a treatment group and to a control group, making treatment assignment independent of the outcome conditional on these characteristics. However, matching subjects can reduce the sample size. Inverse probability weighting (IPW) maintains the sample size, but extreme weights can lead to unstable estimates. While PSM and IPW have historically been used in conjunction with linear regression, machine learning methods (including deep learning with propensity dropout) have been proposed to account for nonlinear treatment assignments. In this work, we propose a novel deep learning approach, the Propensity Score Synthetic Augmentation Matching using Generative Adversarial Networks (PSSAM-GAN), that aims to preserve the sample size without IPW by generating synthetic matches. PSSAM-GAN can be used in conjunction with any other prediction method to estimate treatment effects. Experiments performed on both semi-synthetic data (perinatal interventions) and real-world observational data (antibiotic treatments and job interventions) show that the PSSAM-GAN approach effectively creates balanced datasets, relaxing the need for weighting or dropout in downstream methods, and provides competitive performance in effect estimation compared with a simple GAN and when used in conjunction with other deep counterfactual learning architectures, e.g. TARNet.
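
The trade-off between PSM and IPW described in the abstract can be illustrated with a minimal sketch. It is not the PSSAM-GAN method: it assumes simulated covariates, a logistic propensity model fitted by plain gradient descent, and greedy 1:1 matching with a caliper. The function names (`fit_propensity`, `match_nearest`, `ipw_weights`), the caliper of 0.05, and the simulated data are all illustrative assumptions and do not come from the paper.

```python
import numpy as np


def fit_propensity(X, t, n_iter=500, lr=0.1):
    """Estimate propensity scores with logistic regression (gradient descent)."""
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - t) / len(t)  # gradient of the mean log-loss
    return 1.0 / (1.0 + np.exp(-Xb @ w))


def match_nearest(ps, t, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on the score, without replacement."""
    available = list(np.flatnonzero(t == 0))  # control indices not yet used
    pairs = []
    for i in np.flatnonzero(t == 1):
        if not available:
            break
        dist = np.abs(ps[available] - ps[i])
        j = int(np.argmin(dist))
        if dist[j] <= caliper:  # treated units with no close control are dropped
            pairs.append((int(i), int(available.pop(j))))
    return pairs


def ipw_weights(ps, t):
    """Inverse probability weights: 1/e(x) for treated, 1/(1 - e(x)) for controls."""
    return np.where(t == 1, 1.0 / ps, 1.0 / (1.0 - ps))


# Simulated observational data (an assumption for illustration only):
# treatment assignment depends on the covariates, i.e. it is confounded.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 2))
true_p = 1.0 / (1.0 + np.exp(-(1.5 * X[:, 0] - 0.5 * X[:, 1])))
t = (rng.random(n) < true_p).astype(float)

ps = fit_propensity(X, t)
pairs = match_nearest(ps, t)
weights = ipw_weights(ps, t)

print("original sample size:", n)
print("matched sample size: ", 2 * len(pairs))
print("largest IPW weight:  ", float(weights.max()))
```

In this sketch, matching keeps only the units that find a partner within the caliper, so the matched sample is usually smaller than the original one, whereas IPW keeps every unit but can assign large weights to units with propensity scores near 0 or 1.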

Matched sample selection with GANs for mitigating attribute confounding

Dual Distribution Matching GAN

Two Birds with One Stone: Iteratively Learn Facial Attributes with GANs

Propensity Score Synthetic Augmentation Matching Using Generative Adversarial Networks (PSSAM-GAN)

Enhancing Stability in Training Conditional Generative Adversarial Networks via Selective Data Matching

Generative Adversarial Networks for Mitigating Biases in Machine Learning Systems

MatchGAN: A Self-Supervised Semi-Supervised Conditional Generative Adversarial Network

Visualizing chest X-ray dataset biases using GANs

Removing Class Imbalance using Polarity-GAN: An Uncertainty Sampling Approach

Imperfect ImaGANation: Implications of GANs Exacerbating Biases on Facial Data Augmentation and Snapchat Selfie Lenses

Self-Conditioned GANs for Image Editing

The (de)biasing effect of GAN-based augmentation methods on skin lesion images

Mitigating Dataset Imbalance via Joint Generation and Classification

Examining Pathological Bias in a Generative Adversarial Network Discriminator: A Case Study on a StyleGAN3 Model

Class Balancing GAN with a Classifier in the Loop

A Constructive GAN-based Approach to Exact Estimate Treatment Effect without Matching

SMaRt: Improving GANs with Score Matching Regularity

Considering How Machine‐Learning Algorithms (Re)produce Social Biases in Generated Faces

An Assessment of GANs for Identity-related Applications

MatchingGAN: Matching-Based Few-Shot Image Generation

Improving the Fairness of Deep Generative Models without Retraining