Estimation of inbreeding and kinship coefficients via latent identity-by-descent states

Yongtao Guan, Daniel Levy
DOI: https://doi.org/10.1093/bioinformatics/btae082
IF: 5.8
2024-02-01
Bioinformatics
Abstract:
Motivation: Estimating the individual inbreeding coefficient and pairwise kinship is an important problem in human genetics (e.g. in disease mapping) and in animal and plant genetics (e.g. inbreeding design). Existing methods, such as the sample correlation-based genetic relationship matrix, KING, and UKin, are either biased, unable to estimate inbreeding coefficients, or produce a large proportion of negative estimates that are difficult to interpret. This limitation is partly due to a failure to model inbreeding explicitly. Since all humans are inbred to various degrees by virtue of shared ancestry, it is prudent to account for inbreeding when inferring kinship between individuals.
Results: We present "Kindred," an approach that estimates inbreeding and kinship by modeling latent identity-by-descent states that account for all possible allele sharing, including inbreeding, between two individuals. Kindred uses the non-negative least squares method to fit the model, which not only increases computational efficiency compared to the maximum likelihood method, but also guarantees non-negativity of the kinship estimates. Through simulation, we demonstrate the high accuracy and non-negativity of kinship estimates by Kindred. By selecting a subset of SNPs that are similar in allele frequencies across different continental populations, Kindred can accurately estimate kinship between admixed samples. In addition, we demonstrate that the realized kinship matrix estimated by Kindred is effective in reducing genomic control values via a linear mixed model in genome-wide association studies. Finally, we demonstrate that Kindred produces sensible heritability estimates on an Australian height dataset.
Availability and implementation: Kindred is implemented in C with multi-threading. It takes a VCF file or stream as input and works seamlessly with bcftools. Kindred is freely available at https://github.com/haplotype/kindred.
biochemical research methods,biotechnology & applied microbiology,mathematical & computational biology
What problem does this paper attempt to address?
The paper addresses the accurate estimation of individual inbreeding coefficients and pairwise kinship, a problem that arises in human genetics (e.g. disease mapping) and in animal and plant genetics (e.g. inbreeding design). Existing methods, such as the sample correlation-based genetic relationship matrix, KING, and UKin, are either biased, unable to estimate inbreeding coefficients, or prone to producing many negative estimates that are difficult to interpret. The authors attribute these shortcomings partly to the failure to model inbreeding explicitly. They propose Kindred, which estimates inbreeding and kinship by modeling latent identity-by-descent states that cover all possible allele sharing between two individuals, including inbreeding. Kindred fits the model by non-negative least squares, which is more computationally efficient than maximum likelihood and guarantees non-negative kinship estimates. In simulations, the authors demonstrate the high accuracy and non-negativity of Kindred's kinship estimates. They also show that Kindred can estimate kinship between admixed samples when restricted to SNPs with similar allele frequencies across continental populations, that its kinship matrix reduces genomic control values in linear mixed model genome-wide association studies, and that it yields sensible heritability estimates on an Australian height dataset.
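To illustrate why a non-negative least squares fit cannot return negative coefficients, here is a minimal sketch in Python. It is not Kindred's solver or its design matrix: the function name `nnls_projected_gradient`, the toy matrix `A`, and the vector `b` are all hypothetical, and projected gradient descent is used only because it is short and easy to follow. On this toy problem, ordinary least squares would give a negative second coefficient, while the constrained fit clamps it to zero.

```python
def nnls_projected_gradient(A, b, steps=500):
    """Minimise ||A x - b||^2 subject to x >= 0 (toy projected gradient descent)."""
    m, n = len(A), len(A[0])
    # Precompute A^T A and A^T b.
    AtA = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
           for i in range(n)]
    Atb = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    # trace(A^T A) bounds the largest eigenvalue, so 1/trace is a safe step size.
    step = 1.0 / sum(AtA[i][i] for i in range(n))
    x = [0.0] * n
    for _ in range(steps):
        # Gradient of 0.5 * ||A x - b||^2 is A^T A x - A^T b.
        grad = [sum(AtA[i][j] * x[j] for j in range(n)) - Atb[i]
                for i in range(n)]
        # Gradient step, then projection onto the non-negative orthant.
        x = [max(0.0, x[i] - step * grad[i]) for i in range(n)]
    return x


# Hypothetical toy data: unconstrained least squares gives (0.9, -0.6) here.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, -0.5, 0.2]
x = nnls_projected_gradient(A, b)
print(x)
```

The projection step (`max(0.0, ...)`) is what enforces the constraint: every iterate, and therefore the final estimate, is non-negative by construction.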