Abstract:

OBJECTIVE: Genotype imputation is a commonly used technique that infers untyped variants in a study's genotype data, allowing better identification of causal variants in disease studies. However, because studies of Caucasian populations are overrepresented, the genetic basis of health outcomes in other ethnic populations remains poorly understood. Facilitating the imputation of missing key predictor variants that could improve health-outcome risk prediction models, specifically for Asian ancestry, is therefore highly relevant.

METHODS: We aimed to construct an imputation and analysis web platform that primarily facilitates, but is not limited to, genotype imputation for East Asians. The goal is to provide a collaborative imputation platform for researchers in the public domain, enabling rapid, efficient, and accurate genotype imputation.

RESULTS: We present an online genotype imputation platform, the Multi-ethnic Imputation System (MI-System) (https://misystem.cgm.ntu.edu.tw/), which offers users 3 established pipelines for imputation analyses: SHAPEIT2-IMPUTE2, SHAPEIT4-IMPUTE5, and Beagle5.1. In addition to the 1000 Genomes and HapMap3 panels, a new customized Taiwan Biobank (TWB) reference panel, created specifically for Taiwanese-Chinese ancestry, is provided. MI-System further offers functions to create customized reference panels for imputation, conduct quality control, split whole-genome data into chromosomes, and convert genome builds.

CONCLUSION: Users can upload their genotype data and perform imputation with minimal effort and resources. The utility functions can also be used to preprocess user-uploaded data in a few clicks. MI-System has the potential to contribute to Asian-population genetics research while eliminating the need for high-performance computational resources and bioinformatics expertise. It will enable an increased pace of research and provide a knowledge base on genetic carriers of complex diseases, thereby greatly enhancing patient-driven research.

STATEMENT OF SIGNIFICANCE: The Multi-ethnic Imputation System (MI-System) primarily facilitates, but is not limited to, imputation for East Asians through 3 established prephasing-imputation pipelines (SHAPEIT2-IMPUTE2, SHAPEIT4-IMPUTE5, and Beagle5.1); users can upload their genotype data and perform imputation and other utility functions with minimal effort and resources. A new customized Taiwan Biobank (TWB) reference panel, created specifically for Taiwanese-Chinese ancestry, is provided. Utility functions include (a) creating customized reference panels, (b) conducting quality control, (c) splitting whole-genome data into chromosomes, and (d) converting genome builds. Users can also combine 2 reference panels within the system and use the combined panel as the reference for imputation.
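As an illustration of one of the utility functions named above, splitting whole-genome data into chromosomes, the following is a minimal sketch only. It assumes uncompressed, tab-delimited VCF text as input; the function name `split_vcf_by_chromosome` and the example records are hypothetical and do not reflect how MI-System itself implements this step.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def split_vcf_by_chromosome(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group VCF lines by chromosome, copying the header into every group.

    Hypothetical helper for illustration; not part of MI-System.
    """
    header: List[str] = []
    records: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            header.append(line)  # meta-information and column header lines
        else:
            chrom = line.split("\t", 1)[0]  # first column is CHROM
            records[chrom].append(line)
    # Each per-chromosome group gets its own copy of the full header.
    return {chrom: header + recs for chrom, recs in records.items()}


# Made-up example records (assumed data, for demonstration only).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1",
    "1\t100\trs1\tA\tG\t.\tPASS\t.\tGT\t0/1",
    "2\t200\trs2\tC\tT\t.\tPASS\t.\tGT\t1/1",
    "1\t150\trs3\tG\tA\t.\tPASS\t.\tGT\t0/0",
]
per_chrom = split_vcf_by_chromosome(example)
for chrom in sorted(per_chrom):
    print(chrom, len(per_chrom[chrom]) - 2, "record(s)")
```

Per-chromosome files of this kind are the usual input unit for prephasing and imputation tools, which generally process one chromosome at a time.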

A fast data-driven method for genotype imputation, phasing and local ancestry inference: MendelImpute.jl

A Fast and Flexible Statistical Model for Large-Scale Population Genotype Data: Applications to Inferring Missing Genotypes and Haplotypic Phase

FastImpute: A Baseline for Open-source, Reference-Free Genotype Imputation Methods -- A Case Study in PRS313

Multi-ethnic Imputation System (MI-System): A genotype imputation server for high-dimensional data

Genotype imputation using the Positional Burrows Wheeler Transform

A New Genotype Imputation Method with Tolerance to High Missing Rate and Rare Variants

MaCH-admix: Genotype Imputation for Admixed Populations

Simpute: an Efficient Solution for Dense Genotypic Data

Simpute: A Simple Genotype Imputation Method

Fast and accurate haplotype inference with hidden Markov model

Rapid and accurate genotype imputation from low coverage short read, long read, and cell free DNA sequence

Fast and accurate imputation of genotypes from noisy low-coverage sequencing data in bi-parental populations

Integer programming framework for pangenome-based genome inference

Graph-Based Imputation Methods and Their Applications to Single Donors and Families

A General Approach for Haplotype Phasing across the Full Spectrum of Relatedness

Accurate Haplotype Inference for Multiple Linked Single-Nucleotide Polymorphisms Using Sibship Data

Local Haplotype Classifiers enable Efficient, Flexible, and Secure Genotype Imputation and Downstream Analyses

MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes

Phasing millions of samples achieves near perfect accuracy, enabling parent-of-origin classification of variants

SNP Genotype Imputation in Forensics—A Performance Study