Ancestry Analysis Using a Self-Developed 56 AIM-InDel Panel and Machine Learning Methods

Liu, Shuanglin Li, Wei Cui, Yating Fang, Shuyan Mei, Man Chen, Hui Xu, Xiaole Bai, Bofeng Zhu
DOI: https://doi.org/10.1016/j.forsciint.2024.112065
IF: 2.676
2024-01-01
Forensic Science International
Abstract: Insertion/deletion (InDel) polymorphisms can serve as ancestry-informative markers (AIMs) in ancestry analysis. In this study, a self-developed panel of 56 ancestry-informative InDels was used to investigate the genetic structure of the Chinese Inner Mongolia Manchu group and its genetic relationships with 26 reference populations. The Inner Mongolia Manchu group was closely related in genetic background to East Asian populations, especially the Han Chinese in Beijing. Moreover, populations from northern and southern East Asia displayed obvious differences in ancestral components, suggesting that the panel has potential value for distinguishing northern from southern East Asian populations. Subsequently, four machine learning models were built on the 56 AIM-InDel loci to evaluate the performance of the panel in ancestry prediction. The random forest model performed better in ancestry prediction, with 91.87% and 99.73% accuracy for the five and three continental populations, respectively. The random forest model assigned the individuals of the Inner Mongolia Manchu group to the East Asian populations, and they exhibited closer genetic affinities with northern East Asian populations. Furthermore, the random forest model distinguished 87.18% of the Inner Mongolia Manchus from the East Asian populations, suggesting that the random forest model based on the 56 ancestry-informative InDels could be a potential tool for ancestry analysis.
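The abstract does not give implementation details, so the following is only a minimal sketch of how a random forest can assign individuals to populations from InDel genotypes. Everything in it is an assumption made for illustration: the three population labels (pop_A, pop_B, pop_C), the simulated allele frequencies, the genotype coding (0, 1, or 2 copies of the insertion allele), and the forest settings (25 trees, depth 4, 7 candidate loci per split). It is a small pure-Python random forest on synthetic data, not the authors' pipeline, software, or data; only the locus count of 56 comes from the abstract.

```python
import random
from collections import Counter

def gini(labels):
    # Gini impurity of a list of class labels
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, rng, depth=0, max_depth=4, n_feats=7):
    # Return a leaf label at pure nodes or at the depth limit
    if depth == max_depth or len(set(y)) == 1:
        return Counter(y).most_common(1)[0][0]
    best = None
    # Consider a random subset of loci at each node (the "random" in random forest)
    for f in rng.sample(range(len(X[0])), n_feats):
        for t in (0.5, 1.5):  # the only useful thresholds for genotypes 0/1/2
            left = [i for i in range(len(X)) if X[i][f] < t]
            right = [i for i in range(len(X)) if X[i][f] >= t]
            if not left or not right:
                continue
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # no usable split among the sampled loci
        return Counter(y).most_common(1)[0][0]
    _, f, t, left, right = best
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       rng, depth + 1, max_depth, n_feats),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       rng, depth + 1, max_depth, n_feats))

def predict_tree(node, row):
    # Internal nodes are tuples; leaves are population labels (strings)
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] < t else right
    return node

def fit_forest(X, y, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample of individuals for each tree
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    # Majority vote across trees
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

def simulate(freqs, n, rng):
    # Genotype = number of insertion alleles (0, 1, or 2) at each locus
    return [[(rng.random() < p) + (rng.random() < p) for p in freqs]
            for _ in range(n)]

# Synthetic data: hypothetical populations with made-up allele frequencies
rng = random.Random(42)
pops = ["pop_A", "pop_B", "pop_C"]
N_LOCI = 56  # panel size taken from the abstract; all else is illustrative
high = [rng.choice(pops) for _ in range(N_LOCI)]
freqs = {p: [0.9 if h == p else 0.1 for h in high] for p in pops}

X_train, y_train, X_test, y_test = [], [], [], []
for p in pops:
    X_train += simulate(freqs[p], 60, rng)
    y_train += [p] * 60
    X_test += simulate(freqs[p], 30, rng)
    y_test += [p] * 30

forest = fit_forest(X_train, y_train)
preds = [predict_forest(forest, row) for row in X_test]
accuracy = sum(a == b for a, b in zip(preds, y_test)) / len(y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

The simulated populations are deliberately far more differentiated than real continental groups, so the accuracy printed here says nothing about the panel itself. With real genotypes, the same structure would apply with the reference populations as training labels and the study individuals as the rows to predict.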