Phylogenetic analyses of 41 Y-STRs and machine learning-based haplogroup prediction in the Qingdao Han population from Shandong province, Eastern China

Guang-Yao Fan, De-Zhi Jiang, Yao-Heng Jiang, Wei Song, Ying-Yun He, Nixon Austin Wuo
DOI: https://doi.org/10.1080/03014460.2023.2168057
Abstract: Background: Known for its rich history and culture, Qingdao is a typical symbol of Chinese maritime culture. Its genetic landscape has attracted interest among geneticists and forensic scientists, yet it has not previously been characterised. Aim: This investigation aims to shed light on Qingdao's paternal genetic diversity and its evolutionary connections to other Han subgroups. Subjects and methods: The genetic polymorphisms of 41 Y-chromosomal short tandem repeat (Y-STR) loci in the Qingdao Han were investigated using the SureID® PathFinder Plus Kit. Phylogenetic analyses were performed using genotype data from 52 East Asian groups at 23 common Y-STR loci. A multidimensional scaling plot and a cladogram were constructed. Linear Discriminant Analysis (LDA) was carried out to predict group membership among the Han people. The k-nearest neighbour (kNN) algorithm was utilised to assign a Y-SNP haplogroup to each haplotype. Results: The Qingdao Han were genetically distant from the Tibeto-Burman populations and close to the Han people from northern China. LDA indicated deep integration among present-day Han people. The kNN model predicted O2a2 and O2a1 to be the predominant Y-SNP haplogroups. Conclusions: This study should help to reconstruct the patrilineal history of China and to establish a more comprehensive Y-STR database.
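To illustrate the kNN step mentioned in the abstract, the following is a minimal sketch of how a haplogroup label could be assigned to a Y-STR haplotype by majority vote among its nearest reference haplotypes. The abstract does not state the distance metric, the value of k, or the reference panel used, so the Manhattan distance on repeat counts, k = 3, the function names, and the toy repeat counts below are all assumptions made for illustration; they are not data or code from the study.

```python
from collections import Counter


def manhattan(a, b):
    """Sum of absolute differences in repeat counts across loci (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(query, reference, k=3):
    """Return the most common haplogroup among the k closest reference haplotypes.

    `reference` is a list of (haplotype, haplogroup) pairs, where each
    haplotype is a tuple of repeat counts in a fixed locus order.
    """
    ranked = sorted(reference, key=lambda item: manhattan(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical reference panel: invented repeat counts at four unnamed loci.
REFERENCE = [
    ((14, 12, 23, 10), "O2a2"),
    ((14, 12, 24, 10), "O2a2"),
    ((15, 12, 23, 10), "O2a2"),
    ((16, 13, 25, 11), "O2a1"),
    ((16, 14, 25, 11), "O2a1"),
    ((17, 13, 25, 11), "O2a1"),
]

# The three nearest neighbours of this query are all labelled O2a2.
print(knn_predict((14, 12, 23, 11), REFERENCE))  # prints O2a2
```

In practice the reference panel would consist of haplotypes with known Y-SNP haplogroups typed at the same loci as the query, and k would normally be chosen by cross-validation rather than fixed in advance.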