MHBase: a comprehensive database of short microhaplotypes for advancing forensic genetic analysis

Jiaming Xue, Mengyu Tan, Qiushuo Wu, Yazi Zheng, Guihong Liu, Ranran Zhang, Dezhi Chen, Yuanyuan Xiao, Miao Liao, Meli Lv, Shengqiu Qu, Weibo Liang, Lin Zhang
DOI: https://doi.org/10.1016/j.fsigen.2024.103062
IF: 4.453
2024-05-19
Forensic Science International Genetics
Abstract: Microhaplotypes (MHs) were first recommended for forensic use by Prof. Kidd because they can improve human identification, kinship analysis, mixture deconvolution, and ancestry prediction. Since their introduction, extensive research has demonstrated the advantages of MHs in forensic applications and provided useful data for different populations. Currently, two databases, ALFRED (ALlele FREquency Database) and MicroHapDB (MicroHaplotype DataBase), house the published MH information and population data. We previously constructed a SNP-SNP (single nucleotide polymorphism) MH database (D-SNPsDB) covering MHs within 50 bp across the whole human genome for 26 populations, integrating basic data such as physical genome positions, mapping of variant identifiers (rsIDs), allele frequencies, and basic variant information. Building on that work, we selected MHs containing at least two variants (SNPs and/or insertions/deletions [InDels]) within a short DNA fragment (≤ 50 bp) in 26 populations, based on the 1000 Genomes Project dataset (Phase 3), to construct a more comprehensive database. Subsequently, we established a user-friendly website that lets users search the MH database (MHBase) according to their research objectives and study population to find suitable loci. The website also supports querying reported loci, performing online calculations using the PHASE software, and calculating ancestry-related parameters. The loci in the database are classified as SNP-based MHs, which include only SNPs, and InDel-including MHs, which contain at least one InDel. Here, we provide a detailed overview of MHBase and an analysis of shared loci at the global and continental levels, ancestral markers, the genetic distance within loci, and mapping with the genome annotation file. The website is an accessible and useful tool for researchers engaged in marker discovery, population studies, assay development, and panel design.
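The selection rule described in the abstract (at least two variants within ≤ 50 bp, then a split into SNP-based and InDel-including loci) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Variant` record, the `candidate_windows` and `classify` helpers, and the span definition (last start position minus first start position plus one) are all assumptions made for illustration, and the sketch ignores phasing, allele frequencies, and InDel length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    # Hypothetical record: start position on one chromosome and variant type.
    pos: int
    kind: str  # "SNP" or "InDel"


def candidate_windows(variants, max_span=50):
    """Return maximal runs of >= 2 variants whose span is <= max_span bp.

    Span is assumed to be last.pos - first.pos + 1 (start coordinates only).
    """
    ordered = sorted(variants, key=lambda v: v.pos)
    windows = []
    prev_end = -1  # index of the last variant in the previously emitted window
    end = 0
    for start in range(len(ordered)):
        end = max(end, start)
        # Extend the window to the right while it still fits in max_span.
        while (end + 1 < len(ordered)
               and ordered[end + 1].pos - ordered[start].pos + 1 <= max_span):
            end += 1
        # Emit only windows with >= 2 variants that reach past the previous one,
        # so no emitted window is contained in another.
        if end - start + 1 >= 2 and end > prev_end:
            windows.append(ordered[start:end + 1])
            prev_end = end
    return windows


def classify(window):
    """Label a window following the two classes named in the abstract."""
    if all(v.kind == "SNP" for v in window):
        return "SNP-based"
    return "InDel-including"


# Toy input, not real data.
toy = [
    Variant(100, "SNP"), Variant(120, "SNP"), Variant(149, "InDel"),
    Variant(300, "SNP"), Variant(310, "SNP"),
]
for w in candidate_windows(toy):
    print([v.pos for v in w], classify(w))
```

On the toy input, the first three variants fall in one 50 bp window that contains an InDel, and the last two form a separate SNP-only window.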
genetics & heredity, medicine, legal