Minority populations exhibit distinct clinical and genetic features of celiac disease in the United States

Hemanth Karnati, Wenjing Ying, Xin Long, Mary-Joe Touma, Ioana Smith, Suzanne K Lewis, Chao Xing, Ezra Burstein, Alexandre Bolze, Peter H.R. Green, Michelle J Alkalay, Xiao-Fei Kong
DOI: https://doi.org/10.1101/2024.12.20.24319436
2024-12-24
Abstract: Celiac disease (CeD) is a heterogeneous autoimmune disorder influenced by genetic, environmental, and socioeconomic factors. However, little is known about clinical manifestations and genetic risks in minority populations. Using data from the All of Us Research Program, we analyzed 3,040 CeD patients, referred to as the AoU-CeD cohort, to identify clinical and genetic differences across racial and ethnic groups in the United States. CeD prevalence was highest among White individuals (1.08%) and significantly lower among Hispanic (0.36%) and Black (0.16%) populations. The majority of CeD patients were female (78.4%) and diagnosed between the ages of 18 and 64. Minority groups reported poorer physical and mental quality of life (QoL) and higher levels of pain. Ancestry-specific patterns emerged in CeD-associated conditions, with minorities more likely to report diarrhea and non-infectious gastroenteritis but less likely to have osteoporosis, hypothyroidism, chronic fatigue, or a family history of CeD. Whereas previously reported data show that over 90% of CeD patients carry the HLA-DQ2.5 haplotype, genetic analysis revealed that only 49% of patients in the AoU-CeD cohort carried this high-risk haplotype. Additionally, 16.5% lacked known HLA-DQ risk haplotypes, suggesting potential diagnostic or reporting inaccuracies. Minority groups exhibited higher rates of atypical symptoms, lower frequencies of the DQ2.5 haplotype, and distinct distributions of HLA-DQ genotypes. A long haplotype block spanning HLA-A1, B8, C7 and HLA-DQ2.5 was found in Europeans but absent in other ancestries. A genome-wide association study (GWAS) using over 11 million variants from whole-genome sequencing data identified 1,651 significant single-nucleotide polymorphisms (SNPs), primarily within the MHC locus, with the strongest signals observed predominantly among individuals of European ancestry. A predictive model incorporating HLA-DQ genotype, family history, and clinical features achieved 83% accuracy for identifying seropositive CeD. These results highlight the importance of ancestry-specific clinical presentations and genetic features in CeD.