A Robust Approach for Genetic Mapping of Complex Traits
Rongling Wu, Song Wu
2008-01-01
Abstract: Genetic mapping has proven to be a powerful tool for studying the genetic architecture of complex traits by localizing the individual quantitative trait loci (QTLs) that underlie them. For this reason, the past two decades have witnessed a surge of interest in developing statistical methods for QTL mapping. The central theme of QTL mapping is to assign each individual every possible genotype at the unobservable QTL, with a probability determined from observed markers linked to the QTL, and then to model the phenotypic distribution of each QTL genotype by parametric or non-parametric approaches. All statistical mapping methods rely on several key assumptions: (1) markers and QTL segregate in an equilibrium state, (2) a continuous trait obeys a parametric distribution, and (3) there is a direct relationship between genotypes and phenotypes. Violation of any of these assumptions may lead to biased inference about QTL locations and effects and to spurious QTL discoveries. In this dissertation, I will derive a battery of robust statistical approaches for QTL mapping that do not rely on these assumptions and that push mapping work toward more practical settings. Assumption 1 states that the segregation of genetic loci does not deviate from Mendel's first law in an experimental cross or from Hardy-Weinberg equilibrium (HWE) in a natural population. In a cross, differences in viability may occur among gametes or zygotes through unknown mechanisms, leading to distorted segregation. In a natural population affected by various evolutionary forces, individuals may not mate randomly, placing the population in Hardy-Weinberg disequilibrium (HWD). I will develop a general framework for relaxing this equilibrium assumption. Focusing on a natural population, I will demonstrate this framework by relaxing the HWE assumption at the haplotypic and zygotic levels.
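As a small illustration of what departure from HWE means at the zygotic level, the sketch below uses the textbook one-locus parameterization in which a single disequilibrium coefficient shifts the genotype frequencies away from their Hardy-Weinberg values. This parameterization and the function names `genotype_freqs` and `estimate_p_and_d` are assumptions made for illustration; they are not taken from the dissertation, whose framework may be formulated differently.

```python
def genotype_freqs(p, d):
    """Return (P_AA, P_Aa, P_aa) for allele frequency p of A and
    disequilibrium coefficient d; d = 0 recovers Hardy-Weinberg equilibrium."""
    q = 1.0 - p
    freqs = (p * p + d, 2.0 * p * q - 2.0 * d, q * q + d)
    if min(freqs) < 0.0:
        raise ValueError("d is outside the range allowed for this allele frequency")
    return freqs


def estimate_p_and_d(n_AA, n_Aa, n_aa):
    """Moment estimates of the allele frequency and the disequilibrium
    coefficient from observed genotype counts (hypothetical helper)."""
    n = n_AA + n_Aa + n_aa
    p_hat = (2 * n_AA + n_Aa) / (2 * n)  # each AA carries two A alleles, each Aa one
    d_hat = n_AA / n - p_hat ** 2        # excess of AA homozygotes over p^2
    return p_hat, d_hat


# Counts 25/50/25 sit exactly at HWE; counts 30/40/30 show a heterozygote deficit.
p_eq, d_eq = estimate_p_and_d(25, 50, 25)
p_dis, d_dis = estimate_p_and_d(30, 40, 30)
```

A positive coefficient corresponds to an excess of homozygotes, as can arise under non-random mating; a mapping model that assumes HWE implicitly fixes this coefficient at zero.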
Assumption 2 requires that the trait distribution be fit exactly by a parametric function, so that maximum likelihood or Bayesian approaches can be formulated for parameter estimation. However, it is not possible to know the true underlying model for observed phenotypes. I will derive a robust approach that is flexible enough to accommodate a certain degree of misspecification of the true model. Here, I will incorporate the idea of integrated squared error (L2E) into the genetic mapping framework and formulate hypothesis testing by defining a new test statistic, the Energy Difference. Assumption 3 suggests that the outcome phenotype of a complex trait is determined by causal QTLs in a direct way, neglecting the biological pathway or process of trait formation on a time or space scale. A statistical model called functional mapping has been proposed to relax the assumption of direct genotype-phenotype relationships by modeling the dynamic pattern of the genetic control of a trait over a time course with biologically meaningful mathematical equations. Here, I will extend the L2E approach to functional mapping by relaxing the widely used multivariate normality assumption, greatly expanding the breadth of use of functional mapping. My dissertation is divided into three parts, each corresponding to one of the assumptions mentioned above. In each part, I first formulate a general likelihood function, derive computing algorithms for parameter estimation, and provide and prove the relevant theorems. Then, I perform extensive simulation studies to investigate the statistical properties of each approach and compare the results from my newly derived approaches with those from traditional ones. Lastly, analyses of real examples are conducted to demonstrate the usefulness and application of the new approaches in a practical genetic setting. In the last chapter of this dissertation, I discuss several issues pertaining to the future direction of QTL mapping.
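To make the integrated squared error idea mentioned under Assumption 2 concrete, the following is a minimal sketch of the L2E criterion for a single univariate normal density, minimized by a coarse grid search on simulated data with a small cluster of outliers. It is not the dissertation's mapping model: the simulated data, the grid, and the names `l2e_normal`, `fit_l2e_grid`, and `make_demo_data` are all assumptions chosen for illustration, and the Energy Difference statistic is not implemented here.

```python
import math
import random


def l2e_normal(data, mu, sigma):
    """L2E criterion for N(mu, sigma^2): the integral of the squared density
    minus twice the average density evaluated at the observations."""
    n = len(data)
    integral_f_sq = 1.0 / (2.0 * sigma * math.sqrt(math.pi))  # closed form for a normal
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    mean_density = sum(
        norm * math.exp(-0.5 * ((x - mu) / sigma) ** 2) for x in data
    ) / n
    return integral_f_sq - 2.0 * mean_density


def fit_l2e_grid(data, mus, sigmas):
    """Return the (mu, sigma) pair on the grid with the smallest L2E criterion."""
    return min(
        ((m, s) for m in mus for s in sigmas),
        key=lambda pair: l2e_normal(data, pair[0], pair[1]),
    )


def make_demo_data(seed=0):
    """Simulated phenotypes: 180 draws from N(0, 1) plus 20 outliers near 8."""
    rng = random.Random(seed)
    clean = [rng.gauss(0.0, 1.0) for _ in range(180)]
    outliers = [rng.gauss(8.0, 0.5) for _ in range(20)]
    return clean + outliers


data = make_demo_data()
mus = [i * 0.05 for i in range(-40, 81)]       # candidate means from -2 to 4
sigmas = [0.5 + i * 0.1 for i in range(0, 26)]  # candidate SDs from 0.5 to 3
mu_l2e, sigma_l2e = fit_l2e_grid(data, mus, sigmas)
mu_mle = sum(data) / len(data)                  # normal-theory ML estimate of the mean
```

In this toy setting the sample mean is pulled toward the outliers, while the L2E estimate stays close to the center of the main component. That behavior is the kind of tolerance to model misspecification the abstract refers to, shown here only for a one-component normal model.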