A Refined Analysis of Neanderthal-Introgressed Sequences in Modern Humans with a Complete Reference Genome
Shen-Ao Liang, Tianxin Ren, Jiayu Zhang, Jiahui He, Xuankai Wang, Xinrui Jiang, Yuan He, Rajiv C. McCoy, Qiaomei Fu, Joshua M. Akey, Yafei Mao, Lu Chen
DOI: https://doi.org/10.1101/2024.08.09.607285
2024-08-10
Abstract: Background: Leveraging long-read sequencing technologies, the first complete human reference genome, T2T-CHM13, corrects assembly errors in prior references and fills in the remaining 8% of the genome. Studies of archaic admixture in modern humans have so far relied on the GRCh37 reference because of the available archaic genome data, and the impact of T2T-CHM13 on this field remains unknown. Results: We remapped the sequencing reads of the high-quality Altai Neanderthal and Denisovan genomes onto GRCh38 and T2T-CHM13. Compared with GRCh37, T2T-CHM13 significantly improves read mapping quality in the archaic samples. We then applied IBDmix to identify Neanderthal-introgressed sequences in 2,504 individuals from 26 geographically diverse populations across the different references. We observed that the different pre-phasing filtering strategies commonly used in public data can substantially affect the determination of archaic ancestry, which calls for careful consideration of the choice of filters. We discovered ~51 Mb of Neanderthal sequences unique to T2T-CHM13, predominantly located in regions where variants that differ between the GRCh38 and T2T-CHM13 assemblies occur. In addition, we uncovered new instances of population-specific archaic introgression in diverse populations, covering genes involved in metabolism, olfaction, and ion channels. Finally, we integrated the introgressed sequences and adaptive signals across all references into a visualization database website, ASH (www.arcseqhub.com), to facilitate the use of archaic alleles and adaptive signals in human genomics and evolutionary research. Conclusions: Our study refines the detection of archaic variation in modern humans, highlights the utility of the T2T-CHM13 reference, and provides novel insights into the functional consequences of archaic hominin admixture.
Evolutionary Biology