CD59 gene: 143 haplotypes of 22,718 nucleotides length by computational phasing in 113 individuals from different ethnicities

Kshitij Srivastava, Qinan Yin, Addisalem Taye Makuria, Maria Rios, Amha Gebremedhin, Willy Albert Flegel
DOI: https://doi.org/10.1111/trf.17869
2024-05-31
Transfusion
Abstract:

Background: CD59 deficiency due to rare germline variants in the CD59 gene causes disabilities, ischemic strokes, neuropathy, and hemolysis. CD59 deficiency due to common somatic variants in the PIG‐A gene in hematopoietic stem cells causes paroxysmal nocturnal hemoglobinuria. The ISBT database lists one nonsense and three missense germline variants that are associated with the CD59‐null phenotype. To analyze the genetic diversity of the CD59 gene, we determined long‐range CD59 haplotypes among individuals from different ethnicities.

Methods: We determined a 22.7 kb genomic fragment of the CD59 gene in 113 individuals using next‐generation sequencing (NGS), which covered the whole NM_203330.2 mRNA transcript of 7796 base pairs. Samples came from an FDA reference repository and our Ethiopia study cohorts. The raw genotype data were computationally phased into individual haplotype sequences.

Results: Nucleotide sequencing of the CD59 gene of 226 chromosomes identified 216 positions with single nucleotide variants. Only three haplotypes were observed in homozygous form, which allowed us to assign them unambiguously as experimentally verified CD59 haplotypes. They were also the most frequent haplotypes among both cohorts. An additional 140 haplotypes were imputed computationally.

Discussion: We provided a large set of haplotypes and proposed three verified long‐range CD59 reference sequences, based on a population approach, using a generalizable rationale for our choice. Correct long‐range haplotypes are useful as template sequences for allele calling in high‐throughput NGS and precision medicine approaches, thus enhancing the reliability of clinical diagnostics. Long‐range haplotypes can also be used to evaluate the influence of genetic variation on the risk of transfusion reactions or diseases.
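The abstract's distinction between haplotypes "observed in homozygous form" and those "imputed computationally" can be illustrated with a minimal sketch. It is not the authors' phasing pipeline, which the abstract does not describe; the function names and toy genotypes below are hypothetical. It only shows why an individual with no heterozygous positions yields one unambiguous haplotype, whereas an individual with k heterozygous positions is compatible with 2^(k-1) distinct haplotype pairs that unphased genotype data cannot tell apart.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: one unordered allele pair per variant position.
Genotype = List[Tuple[str, str]]


def heterozygous_sites(genotype: Genotype) -> int:
    """Count positions where the two alleles differ."""
    return sum(1 for a, b in genotype if a != b)


def possible_haplotype_pairs(genotype: Genotype) -> int:
    """Number of distinct haplotype pairs consistent with an unphased genotype.

    With k heterozygous positions there are 2**(k - 1) pairs (1 if k == 0).
    """
    k = heterozygous_sites(genotype)
    return 1 if k == 0 else 2 ** (k - 1)


def resolve_if_homozygous(genotype: Genotype) -> Optional[str]:
    """Return the haplotype directly when every position is homozygous.

    Returns None otherwise, i.e. when phasing would be needed.
    """
    if heterozygous_sites(genotype) == 0:
        return "".join(a for a, _ in genotype)
    return None


# Toy genotypes (invented for illustration, not data from the study).
homozygous = [("A", "A"), ("G", "G"), ("T", "T")]
heterozygous = [("A", "C"), ("G", "G"), ("T", "C")]

# Counts stated in the abstract: 113 diploid individuals give 226 chromosomes,
# and 3 verified plus 140 imputed haplotypes give the 143 in the title.
n_individuals = 113
n_chromosomes = 2 * n_individuals
n_haplotypes = 3 + 140
```

In this toy case the homozygous individual resolves to a single haplotype without any inference, while the individual with two heterozygous positions is compatible with two different haplotype pairs, so computational phasing is required to choose between them.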
hematology