Abstract: Joint analysis of multiple correlated phenotypes in genome-wide association studies (GWAS) can identify and help interpret pleiotropic loci, which are essential to understanding pleiotropy in diseases and complex traits. Meanwhile, constructing a network from associations between phenotypes and genotypes offers a new way to analyze multiple phenotypes and to explore whether phenotypes and genotypes are related to each other at a higher level of cellular and organismal organization. In this paper, we first construct a signed bipartite network, the Genotype and Phenotype Network (GPN), by linking phenotypes and genotypes. The GPN can be constructed from a mixture of quantitative and qualitative phenotypes and is applicable to binary phenotypes with extremely unbalanced case-control ratios in large-scale biobank datasets. We then apply a powerful community detection method to partition phenotypes into disjoint network modules based on the GPN. Finally, we jointly test the association between the multiple phenotypes in a network module and a single nucleotide polymorphism (SNP). Simulations and analyses of 72 complex traits in the UK Biobank show that multiple-phenotype association tests based on network modules detected by GPN are much more powerful than those that do not consider network modules. The GPN provides a new way to investigate the genetic architecture among different types of phenotypes. Multiple-phenotype association studies based on GPN are improved by incorporating genetic information into the phenotype clustering. Notably, the GPN might broaden the understanding of the genetic architecture that exists between diagnoses, genes, and pleiotropy.
Biological pleiotropy refers to a SNP or gene having a direct biological influence on more than one phenotypic trait, and it can offer significant insight into complex genotype-phenotype relationships. Network analyses provide an integrative approach to characterizing complex genomic associations by linking phenotypes and genotypes into a Genotype and Phenotype Network (GPN). Jointly analyzing multiple phenotypes and incorporating genetic information into the phenotype clustering may increase the statistical power to discover cross-phenotype associations and pleiotropy. We evaluate our proposed multiple-phenotype association tests, based on network modules detected by GPN, on 72 EHR-derived phenotypes of diseases of the musculoskeletal system and connective tissue in the UK Biobank. From the post-GWAS analyses, we observe that the test based on GPN identifies more significantly enriched biological pathways than the test that does not consider network modules. Meanwhile, some of the SNPs identified only by the test based on GPN are also colocalized in the eQTL study of gene expression in the Muscle Skeletal tissue.
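
The three steps described above (build a signed phenotype-genotype network, partition phenotypes into modules, test each module jointly against a SNP) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not the paper's actual procedure: the edge weights are hypothetical signed Z-scores, the community detection step is replaced by connected components of a thresholded cosine-similarity graph, and the joint test is a plain sum of squared Z-scores that treats phenotype statistics as independent. The function names, thresholds, and data are all hypothetical.

```python
import math

def signed_bipartite_weights(z, threshold=1.0):
    """Keep signed phenotype-SNP statistics as edge weights; drop weak edges.
    z[k][m] is a hypothetical signed association statistic (phenotype k, SNP m)."""
    return [[v if abs(v) >= threshold else 0.0 for v in row] for row in z]

def cosine(u, v):
    """Cosine similarity of two phenotype rows (0.0 if either row has no edges)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def phenotype_modules(weights, min_similarity=0.5):
    """Stand-in for community detection: connected components of the graph
    that links two phenotypes when their SNP profiles are similar enough."""
    n = len(weights)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(weights[i], weights[j]) >= min_similarity:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

def module_statistic(z, module, snp):
    """Sum of squared Z-scores over the phenotypes in one module for one SNP.
    Assumes independent statistics; correlated phenotypes would need a
    correlation-aware test instead."""
    return sum(z[k][snp] ** 2 for k in module)

# Toy data: 4 phenotypes (rows) by 3 SNPs (columns), invented for illustration.
z = [
    [3.0, 2.5, 0.1],
    [2.8, 2.9, -0.2],
    [-0.1, 0.2, 4.0],
    [0.3, -0.1, 3.5],
]
weights = signed_bipartite_weights(z)
modules = phenotype_modules(weights)
for module in modules:
    print(module, [round(module_statistic(z, module, m), 2) for m in range(3)])
```

On this toy input the first two phenotypes share signals at the first two SNPs and the last two share a signal at the third, so the sketch groups them into two modules. Testing a SNP against a module of related phenotypes, rather than against all phenotypes at once, is the intuition behind the power gain the abstract reports.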

Rapid and accurate multi-phenotype imputation for millions of individuals

FastImpute: A Baseline for Open-source, Reference-Free Genotype Imputation Methods -- A Case Study in PRS313

A New Genotype Imputation Method with Tolerance to High Missing Rate and Rare Variants

Multi-ethnic Imputation System (MI-System): A genotype imputation server for high-dimensional data

Rapid and accurate genotype imputation from low coverage short read, long read, and cell free DNA sequence

Trait Imputation Enhances Nonlinear Genetic Prediction for Some Traits

Multi-trait genome-wide analyses of the brain imaging phenotypes in UK Biobank

High-throughput multimodal automated phenotyping (MAP) with application to PheWAS

Large-scale Genotyping of Complex DNA

Simpute: A Simple Genotype Imputation Method

Efficient multi-phenotype genome-wide analysis identifies genetic associations for unsupervised deep-learning-derived high-dimensional brain imaging phenotypes

Using Family-Based Imputation in Genome-Wide Association Studies with Large Complex Pedigrees: the Framingham Heart Study

Simpute: an Efficient Solution for Dense Genotypic Data

A Fast and Accurate Method for Genome-Wide Time-to-Event Data Analysis and Its Application to UK Biobank

A novel method for multiple phenotype association studies based on genotype and phenotype network

A fast data-driven method for genotype imputation, phasing and local ancestry inference: MendelImpute.jl

Genotype imputation using the Positional Burrows Wheeler Transform

MaCH-admix: Genotype Imputation for Admixed Populations.

Joint analysis of multiple phenotypes for extremely unbalanced case-control association studies

Missing value imputation in high-dimensional phenomic data: imputable or not, and how?

Empowering GWAS Discovery through Enhanced Genotype Imputation