Long-read sequencing of 945 Han individuals identifies novel structural variants associated with phenotypic diversity and disease susceptibility

Jiao Gong, Huiru Sun, Kaiyuan Wang, Yanhui Zhao, Yechao Huang, Qinsheng Chen, Hui Qiao, Yang Gao, Jialin Zhao, Yunchao Ling, Ruifang Cao, Jingze Tan, Qi Wang, Yanyun Ma, Jing Li, Jingchun Luo, Sijia Wang, Jiucun Wang, Guoqing Zhang, Shuhua Xu, Feng Qian, Fang Zhou, Huiru Tang, Dali Li, Fritz J Sedlazeck, Li Jin, Yuting Guan, Shaohua Fan
DOI: https://doi.org/10.1101/2024.03.21.24304654
2024-03-22
Abstract: Genomic structural variants (SVs) are a major source of genetic diversity in humans. Although numerous studies explore SV diversity across global populations and their potential impacts, validation using model systems is needed to confirm the reported genotype-phenotype associations. Here, through long-read sequencing of 945 Han Chinese genomes, we identify 111,288 SVs, including 24.56% unreported variants, many predicted to be functionally important. Our analysis unveils the multifaceted origins of these SVs within the Han population, with approximately 24% emerging in the common ancestor of modern humans. By integrating human population-level phenotypic, metabolic, and immunologic data with two humanized mouse models, we demonstrate the causal roles of two SVs: one SV that emerged in the common ancestor of modern humans, Neanderthals, and Denisovans in for bone density, and one modern-human-specific SV in impacting height, weight, fat, craniofacial phenotypes, and innate immunity. Some of these phenotypes were previously unreported and were irreproducible in mouse knockout experiments. Our results suggest that the SV in could serve as a rapid and cost-effective predictive biomarker for evaluating GSDMD-mediated pyroptosis in multiple organ injuries, including cisplatin-induced acute kidney injury. Although both SVs were initially identified in the Han, their functional conservation from human to mouse, signals of positive natural selection specifically in non-African populations including the Han, and associations with multiple disease risks suggest that both SVs likely influence local adaptation, phenotypic diversity, and disease susceptibility across many non-African populations.
Genetic and Genomic Medicine