Estimating pairwise relatedness in a small sample of individuals

J Wang
DOI: https://doi.org/10.1038/hdy.2017.52
IF: 3.832
2017-08-30
Heredity
Abstract: The genetic relatedness between individuals because of their recent common ancestry is now routinely estimated from marker genotype data in molecular ecology, evolutionary biology and conservation studies. The estimators developed for this purpose assume that marker allele frequencies in a population are known without errors. Unfortunately, however, these frequencies, upon which both the definition and the estimation of relatedness are based, are rarely known in reality. Frequently, the only data available in a relatedness analysis are a sample of multilocus genotypes from which both allele frequencies and relatedness must be deduced. Furthermore, because of various constraints, sample sizes of individuals can be quite small (say <50 individuals) in practice. This study shows, for the first time, that the widely used relatedness estimators become severely biased when they use allele frequencies calculated from an extremely small sample (say <10 individuals). The extent of bias depends on the sample size, the (unknown) population allele frequencies, the actual relatedness and the estimators. It also shows that relatedness estimators become even more biased when they use allele frequencies calculated from a sample by omitting a focal pair of individuals whose relatedness is being estimated. This study modifies two estimators to suit small samples and shows, both analytically and by analysing simulated and empirical data, that the two modified estimators are much less biased, more precise and more accurate than the original estimators. These performance advantages of the modified estimators are shown to increase with a decreasing sample size of individuals and with an increasing value of actual relatedness.
genetics & heredity,ecology,evolutionary biology
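
The small-sample bias described in the abstract can be illustrated with a short simulation. The sketch below is not one of the estimators studied or modified in the paper. It assumes a simple allele-sharing moment estimator, r = 2(s - H0)/(1 - H0) summed over loci, where s is the mean identity-in-state across the four allele pairs between two individuals and H0 is the sum of squared allele frequencies. All function names, the choice of full sibs (true r = 0.5), and the parameter values (6 individuals, 20 loci, 5 equally frequent alleles) are assumptions made for illustration only.

```python
import random


def draw_genotype(rng, freqs):
    """Draw a diploid genotype (two allele indices) from population frequencies."""
    alleles = rng.choices(range(len(freqs)), weights=freqs, k=2)
    return (alleles[0], alleles[1])


def full_sib_pair(rng, freqs):
    """Simulate two full sibs from unrelated, non-inbred parents (true r = 0.5)."""
    mother = draw_genotype(rng, freqs)
    father = draw_genotype(rng, freqs)
    sib1 = (rng.choice(mother), rng.choice(father))
    sib2 = (rng.choice(mother), rng.choice(father))
    return sib1, sib2


def sample_freqs(genotypes, n_alleles):
    """Estimate allele frequencies by counting alleles in a sample of genotypes."""
    counts = [0] * n_alleles
    for genotype in genotypes:
        for allele in genotype:
            counts[allele] += 1
    total = 2 * len(genotypes)
    return [c / total for c in counts]


def relatedness(x, y, freqs_per_locus):
    """Illustrative moment estimator: r = 2 * sum(s - H0) / sum(1 - H0)."""
    num = 0.0
    den = 0.0
    for (a, b), (c, d), p in zip(x, y, freqs_per_locus):
        s = ((a == c) + (a == d) + (b == c) + (b == d)) / 4  # allele sharing
        h0 = sum(q * q for q in p)  # chance identity of two random alleles
        num += s - h0
        den += 1.0 - h0
    return 2.0 * num / den


def simulate(n_sample=6, n_loci=20, n_alleles=5, reps=1000, seed=1):
    """Return mean estimates using (true frequencies, sample frequencies)."""
    rng = random.Random(seed)
    true_p = [1.0 / n_alleles] * n_alleles
    sum_true = 0.0
    sum_est = 0.0
    for _ in range(reps):
        x, y, est_freqs = [], [], []
        for _ in range(n_loci):
            gx, gy = full_sib_pair(rng, true_p)
            # The rest of the sample consists of unrelated individuals.
            others = [draw_genotype(rng, true_p) for _ in range(n_sample - 2)]
            x.append(gx)
            y.append(gy)
            # Frequencies are estimated from the whole sample, focal pair included.
            est_freqs.append(sample_freqs([gx, gy] + others, n_alleles))
        sum_true += relatedness(x, y, [true_p] * n_loci)
        sum_est += relatedness(x, y, est_freqs)
    return sum_true / reps, sum_est / reps


if __name__ == "__main__":
    mean_true, mean_est = simulate()
    print("mean estimate with true frequencies:  ", round(mean_true, 3))
    print("mean estimate with sample frequencies:", round(mean_est, 3))
```

Under these assumptions, the mean estimate with known population frequencies should sit close to 0.5, while the mean estimate with frequencies taken from the six sampled individuals should fall clearly below it. This is consistent in direction with the bias the abstract reports, but the sketch says nothing about its size for the estimators the paper actually examines.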