Increasing Representativeness in the All of Us Cohort Using Inverse Probability Weighting

Manoj S. Kambara, Shivam Sharma, John L Spouge, I. King Jordan, Leonardo Mariño-Ramírez
DOI: https://doi.org/10.1101/2024.10.02.24314774
2024-10-02
Abstract: Large-scale population biobanks rely on volunteer participants, which may introduce biases that compromise the external validity of epidemiological studies. We characterized the volunteer participant bias for the All of Us Research Program cohort and developed a set of inverse probability (IP) weights that can be used to mitigate this bias. Compared to the US population, the All of Us cohort is older, more female, more educated, more likely to be covered by health insurance, less White, less likely to drink or smoke, and less healthy. IP weights developed via comparison with a nationally representative database eliminated the observed biases for all demographic and lifestyle characteristics and reduced the observed differences in disease prevalence. IP weights also impact genetic associations with type 2 diabetes across diverse ancestry cohorts. We provide our IP weights as a community resource to increase the representativeness and external validity of the All of Us cohort.
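As a rough illustration of the idea behind IP weighting, the sketch below reweights a toy volunteer sample so that its stratum shares match assumed population shares, then compares unweighted and weighted prevalence. It is a minimal cell-based example and not the authors' procedure: the abstract does not specify how the weights were estimated, and the function names (`ip_weights`, `weighted_prevalence`), the strata, and all numbers are hypothetical.

```python
from collections import Counter


def ip_weights(sample_strata, population_shares):
    """Return one weight per participant.

    Each weight is the stratum's share in the target population divided by
    its share in the sample, i.e. proportional to the inverse of the
    (relative) probability of being sampled from that stratum.
    """
    n = len(sample_strata)
    counts = Counter(sample_strata)
    return [population_shares[s] / (counts[s] / n) for s in sample_strata]


def weighted_prevalence(outcomes, weights):
    """Weighted mean of a binary outcome (1 = has condition, 0 = does not)."""
    return sum(w * y for w, y in zip(weights, outcomes)) / sum(weights)


# Toy data, invented for illustration: the sample is 75% female,
# while the assumed target population is 50% female.
sample_strata = ["F", "F", "F", "M"]
population_shares = {"F": 0.5, "M": 0.5}
outcomes = [1, 0, 0, 1]

weights = ip_weights(sample_strata, population_shares)
unweighted = sum(outcomes) / len(outcomes)
weighted = weighted_prevalence(outcomes, weights)

print(f"unweighted prevalence: {unweighted:.3f}")
print(f"weighted prevalence:   {weighted:.3f}")
```

Over-represented strata receive weights below 1 and under-represented strata receive weights above 1, so weighted estimates shift toward what the target population would show. In practice, weights over many characteristics at once are often estimated with a regression model rather than by cell counts.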