Graph-Based Imputation Methods and Their Applications to Single Donors and Families

Sapir Israeli, Martin Maiers, Yoram Louzoun
DOI: https://doi.org/10.1007/978-1-0716-3874-3_13
Abstract: The outcome of hematopoietic stem cell transplantation (HSCT) and organ transplantation is strongly affected by the matching of the HLA alleles of the donor and the recipient. However, donors, and sometimes recipients, are often typed at low resolution, with some alleles either missing or ambiguous. Imputation methods are therefore required to infer the most probable high-resolution HLA haplotypes consistent with a typing. Such imputation algorithms require predefined haplotype frequencies, so phasing of the typing is required for both imputation and frequency generation.
We have developed a new approach to HLA haplotype and genotype imputation, in which all candidate phases of a typing are first enumerated, and the ambiguity within each phase is then resolved. This ambiguity is resolved through a graph structure of all partial haplotypes and the haplotypes consistent with them.
This phasing approach was used to produce an imputation algorithm, GRIMM (Graph Imputation and Matching). GRIMM was then extended to combine information from multiple races, producing MR-GRIMM (Multi-Race GRIMM). When family information is available, the phasing of each family member can be constrained by the typings of the others. We propose GRAMM (GRaph-bAsed faMily iMputation) to phase alleles in family pedigree HLA typing data and in mother-cord blood unit pairs. Finally, we combined MR-GRIMM with an expectation-maximization (EM) algorithm that estimates haplotype frequencies while sharing information between races, producing MR-GRIMME (MR-GRIMM EM).
We have shown that these algorithms naturally combine information between races and family members. The accuracy of each algorithm is significantly better than that of comparable existing methods. MR-GRIMM leads to high accuracy in matching predictions. GRAMM imputes family members better than either MR-GRIMM or any existing algorithm and has practically no phasing errors. MR-GRIMME obtains a higher likelihood than existing algorithms.
MR-GRIMM, MR-GRIMME, and GRAMM are available as servers or as stand-alone versions on GitHub and PyPI, as detailed in the appropriate sections.
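
The two-step idea described in the abstract (enumerate every candidate phase of a typing, then resolve the ambiguity within each phase against known haplotype frequencies) can be illustrated with a minimal Python sketch. This is not the GRIMM implementation: the frequency table `HAP_FREQ`, the function names, and the three-locus typing are all hypothetical, the frequencies are made-up toy values, and a plain linear scan stands in for the graph of partial haplotypes that the authors describe.

```python
from itertools import product

# Hypothetical toy haplotype frequencies (made-up values, not real data).
# Keys are full haplotypes over the loci (A, B, DRB1).
HAP_FREQ = {
    ("A*01:01", "B*08:01", "DRB1*03:01"): 0.06,
    ("A*02:01", "B*07:02", "DRB1*15:01"): 0.03,
    ("A*01:01", "B*07:02", "DRB1*15:01"): 0.01,
    ("A*02:01", "B*08:01", "DRB1*03:01"): 0.005,
}


def candidate_phases(typing):
    """Yield every phasing of an unphased typing.

    `typing` is a list with one (slot1, slot2) pair per locus; each slot
    is a set of candidate alleles (more than one when ambiguous).
    The first locus is held fixed, so n loci give 2**(n - 1) phases.
    """
    first, rest = typing[0], typing[1:]
    for flips in product((False, True), repeat=len(rest)):
        hap1, hap2 = [first[0]], [first[1]]
        for (a, b), flip in zip(rest, flips):
            if flip:
                a, b = b, a
            hap1.append(a)
            hap2.append(b)
        yield tuple(hap1), tuple(hap2)


def consistent(partial):
    """Return (haplotype, frequency) pairs compatible with a partial haplotype.

    Assumption: a linear scan replaces the graph lookup used by GRIMM.
    """
    return [
        (hap, freq)
        for hap, freq in HAP_FREQ.items()
        if all(allele in slot for allele, slot in zip(hap, partial))
    ]


def impute(typing):
    """Return (score, hap1, hap2) for the most probable pair, or None."""
    best = None
    for p1, p2 in candidate_phases(typing):
        for (h1, f1), (h2, f2) in product(consistent(p1), consistent(p2)):
            score = f1 * f2  # simplified score: product of the two frequencies
            if best is None or score > best[0]:
                best = (score, h1, h2)
    return best


if __name__ == "__main__":
    typing = [
        ({"A*01:01"}, {"A*02:01"}),
        ({"B*07:02"}, {"B*08:01"}),
        ({"DRB1*03:01"}, {"DRB1*15:01"}),
    ]
    print(impute(typing))
```

In this toy example only two of the four candidate phases match haplotypes in the table, and the sketch selects the pair with the larger frequency product. A real implementation would also need to handle missing loci, population-specific frequencies, and normalization of the resulting probabilities, none of which is attempted here.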