Inferring disease risk genes from sequencing data in multiplex pedigrees through sharing of rare variants

Alexandre Bureau, Ferdouse Begum, Margaret A. Taub, Jacqueline B. Hetmanski, Margaret M. Parker, Hasan Albacha-Hejazi, Alan F. Scott, Jeffrey C. Murray, Mary L. Marazita, Joan E. Bailey-Wilson, Terri H. Beaty, Ingo Ruczinski
DOI: https://doi.org/10.1002/gepi.22155
2018-09-24
Genetic Epidemiology
Abstract: We previously demonstrated how sharing of rare variants (RVs) in distant affected relatives can be used to identify variants causing a complex and heterogeneous disease. This approach tested whether single RVs were shared by all sequenced affected family members. However, as with other study designs, joint analysis of several RVs (e.g., within genes) is sometimes required to obtain sufficient statistical power. Further, phenocopies can lead to false negatives for some causal RVs if complete sharing among affected is required. Here, we extend our methodology (Rare Variant Sharing, RVS) to address these issues. Specifically, we introduce gene-based analyses, a partial sharing test based on RV sharing probabilities for subsets of affected relatives and a haplotype-based RV definition. RVS also has the desirable feature of not requiring external estimates of variant frequency or control samples, provides functionality to assess and address violations of key assumptions, and is available as open source software for genome-wide analysis. Simulations including phenocopies, based on the families of an oral cleft study, revealed the partial and complete sharing versions of RVS achieved similar statistical power compared with alternative methods (RareIBD and the Gene-Based Segregation Test), and had superior power compared with the pedigree Variant Annotation, Analysis, and Search Tool (pVAAST) linkage statistic. In studies of multiplex cleft families, analysis of rare single nucleotide variants in the exome of 151 affected relatives from 54 families revealed no significant excess sharing in any one gene, but highlighted different patterns of sharing revealed by the complete and partial sharing tests.
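To make the idea of an RV sharing probability concrete, the sketch below computes, for a toy pedigree of two affected first cousins, the probability that both carry a rare variant given that at least one of them does. It is an illustration only and not the RVS software's implementation: the pedigree, the individual labels (G1, G2, P1, P2, S1, S2, C1, C2) and the function names are hypothetical, and it assumes the variant is rare enough that exactly one copy entered the pedigree through a single founder, with every founder equally likely to be the source.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy pedigree: child -> (parent, parent).
# Founders do not appear as keys. Entries are listed parents-before-children.
PEDIGREE = {
    "P1": ("G1", "G2"),  # children of the shared grandparents
    "P2": ("G1", "G2"),
    "C1": ("P1", "S1"),  # affected first cousins
    "C2": ("P2", "S2"),
}
FOUNDERS = ["G1", "G2", "S1", "S2"]
AFFECTED = ["C1", "C2"]


def carrier_patterns(founder):
    """Enumerate carrier sets when `founder` introduces a single variant copy.

    Each non-founder gets one fair coin flip: if a parent carries the copy,
    the flip decides whether it is transmitted (assumes no inbreeding, so at
    most one parent can carry the copy).
    """
    children = list(PEDIGREE)
    weight = Fraction(1, 2 ** len(children))
    for flips in product([0, 1], repeat=len(children)):
        carriers = {founder}
        for child, flip in zip(children, flips):
            if flip and any(parent in carriers for parent in PEDIGREE[child]):
                carriers.add(child)
        yield weight, carriers


def sharing_probability():
    """P(all affected carry the RV | at least one affected carries it)."""
    all_share = Fraction(0)
    any_carrier = Fraction(0)
    for founder in FOUNDERS:  # uniform prior over founders cancels in the ratio
        for weight, carriers in carrier_patterns(founder):
            n_carriers = sum(person in carriers for person in AFFECTED)
            if n_carriers == len(AFFECTED):
                all_share += weight
            if n_carriers >= 1:
                any_carrier += weight
    return all_share / any_carrier


if __name__ == "__main__":
    print(sharing_probability())
```

Under these assumptions the calculation needs no population frequency for the variant, which is consistent with the abstract's remark that external frequency estimates are not required. Restricting `AFFECTED` to a subset of the sequenced relatives gives the kind of subset probability a partial sharing test would draw on, although how the published method combines such probabilities is not described here.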
genetics & heredity, mathematical & computational biology