Causal Integration of Multiple Cancer Cohorts with High-Dimensional Confounders: Bayesian Propensity Score Estimation

Subharup Guha, Yang Li
DOI: https://doi.org/10.48550/arxiv.2312.07873
2023-01-01
Abstract: Comparative meta-analyses of patient groups that integrate multiple observational studies rely on estimated propensity scores (PSs) to mitigate confounder imbalances. However, PS estimation faces theoretical and practical challenges when the confounders are high-dimensional. Motivated by an integrative analysis of breast cancer patients across seven medical centers, this paper tackles the challenges of integrating multiple observational datasets and offering nationally interpretable results. The proposed inferential technique, called Bayesian Motif Submatrices for Confounders (B-MSMC), addresses the curse of dimensionality through a hybrid of Bayesian and frequentist approaches. B-MSMC uses nonparametric Bayesian "Chinese restaurant" processes to eliminate redundancy in the high-dimensional confounders and discover latent motifs, or lower-dimensional structure. With these motifs as potential predictors, standard regression techniques can be used to accurately infer the PSs and facilitate causal group comparisons. Simulations and a meta-analysis of the motivating cancer investigation demonstrate the efficacy of the proposal for high-dimensional causal inference that integrates multiple observational studies. Using different weighting methods, we apply the B-MSMC approach to efficiently address confounding when integrating observational health studies with high-dimensional confounders.
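To make the "Chinese restaurant" process mentioned in the abstract concrete, here is a minimal sketch of drawing one random partition from a generic Chinese restaurant process prior. It is not the B-MSMC procedure itself, which the abstract does not specify in detail. The function name `sample_crp_partition`, the concentration parameter `alpha`, and the choice of 200 items (standing in for hypothetical confounder columns) are assumptions made for illustration only.

```python
import random


def sample_crp_partition(n_items, alpha, seed=0):
    """Draw one partition of n_items from a Chinese restaurant process prior.

    Illustrative only: alpha > 0 is the concentration parameter, and the
    items could stand for hypothetical confounder columns to be grouped.
    """
    rng = random.Random(seed)
    assignments = []  # cluster label assigned to each item, in arrival order
    sizes = []        # current number of items in each cluster
    for _ in range(n_items):
        # An existing cluster k is chosen with weight sizes[k];
        # a brand-new cluster is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # open a new cluster
        else:
            sizes[k] += 1     # join an existing cluster
        assignments.append(k)
    return assignments, sizes


labels, sizes = sample_crp_partition(n_items=200, alpha=2.0)
print(f"{len(sizes)} clusters for {len(labels)} items")
```

Because large clusters attract new items in proportion to their size, the number of clusters typically grows much more slowly than the number of items, which is the sense in which such a prior can group redundant columns into fewer units.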