Initiation of the Primate Genome Project

Dong-Dong Wu, Xiao-Guang Qi, Li Yu, Ming Li, Zhi-Jin Liu, Anne D Yoder, Christian Roos, Takashi Hayakawa, Jeffrey Rogers, Tomas Marques-Bonet, Bing Su, Yong-Gang Yao, Ya-Ping Zhang, Guojie Zhang
DOI: https://doi.org/10.24272/j.issn.2095-8137.2022.001
2022-01-01
Zoological Research
Abstract: A crucial step for understanding human evolution is to identify the genomic changes that occurred during primate evolution, thus allowing investigators to reconstruct the ancestral states preceding the human condition. In the past several decades, the primate clade has been a research focus in genome sequencing due to its unique phylogenetic position and key importance. Comparative genomic analyses of several primate lineages have radically expanded our knowledge on the tempo and mode of different features in primate genome evolution, revealing many genomic innovations contributing to the development and evolution of human phenotypes. However, with less than 10% of primate species currently sequenced, a considerable gap remains regarding the evolutionary history of every base pair in human and non-human primate (NHP) genomes. To fill this gap, we propose to organize and establish the Primate Genome Project (PGP) to scale up the number of high-quality reference genome assemblies for primate species using cutting-edge sequencing technologies. We outline here the possible paths going forward and some of the major questions to be addressed within this ambitious project. We anticipate that genomic comparisons, including broader taxon sampling of extant primate species, will significantly contribute to our understanding of the evolution of human phenotypes and diseases and the genomic mechanisms of primate speciation and adaptation, which will ultimately assist in primate conservation efforts.

Worldwide, there are currently more than 500 primate species from 80 genera and 16 families, with new primate species still being discovered in recent years (Fan et al., 2017; Nater et al., 2017; Roos et al., 2020). As our closest biological relatives, NHPs hold many clues for understanding the origin and evolution of human complex traits, behaviors, and diseases.
NHPs are widely used as biomedical research models for studying the genetic basis of human diseases (e.g., neurodegeneration, diabetes, cancer, and cardiovascular disease), as well as models for reproduction, transplantation, and pharmacology (Rogers & Gibbs, 2014). Furthermore, over the last two years, NHPs have played critical roles in the study of SARS-CoV-2 infection mechanisms and vaccine development (Chandrashekar et al., 2020; Mercado et al., 2020). Therefore, current primate genome sequencing efforts have prioritized the great ape lineage, i.e., taxa closest to humans, and other biomedically relevant species (Carbone et al., 2014; Gibbs et al., 2007; Gordon et al., 2016; Locke et al., 2011; Scally et al., 2012; The Chimpanzee Sequencing and Analysis Consortium, 2005; The Marmoset Genome Sequencing and Analysis Consortium, 2014; Warren et al., 2020), with 90% of primate species yet to be characterized. Due to anthropogenic interference, climate change, and other factors, some 60% of the world’s primate species are threatened with extinction and 75% are experiencing population declines (Estrada et al., 2017), thus highlighting an impending extinction crisis. Therefore, the phylogenomic study of primates is critical for broadening our knowledge on the evolutionary and adaptive history of each species, and for providing relevant information for conservation decisions going forward. Currently, the genomes of a dozen representative NHP species from 22 genera have been published (Ensembl v103), with 72% of genera not yet sequenced. The PGP aims to close this gap and generate long-read-based high-quality reference genomes for at least one representative species in each primate genus in the next few years. Ultimately, we wish to determine the genome sequences for all primate species.
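The genus-level gap quoted above can be checked with a short calculation. The sketch below is only illustrative: it assumes the 22 sequenced genera are all among the 80 recognized genera mentioned earlier and that both counts are exact. The function and constant names are hypothetical and not part of the PGP.

```python
def unsequenced_share(total_genera: int, sequenced_genera: int) -> float:
    """Return the percentage of genera that have no published genome."""
    if total_genera <= 0 or not 0 <= sequenced_genera <= total_genera:
        raise ValueError("counts must satisfy 0 <= sequenced <= total, total > 0")
    return 100.0 * (total_genera - sequenced_genera) / total_genera


# Counts taken from the text; treating them as exact is an assumption.
TOTAL_GENERA = 80       # "80 genera"
SEQUENCED_GENERA = 22   # "22 genera ... (Ensembl v103)"

share = unsequenced_share(TOTAL_GENERA, SEQUENCED_GENERA)
remaining = TOTAL_GENERA - SEQUENCED_GENERA

print(f"{share:.1f}% of genera not yet sequenced")
print(f"{remaining} genera still need at least one reference genome")
```

Under these assumptions the share is 72.5%, in line with the roughly 72% quoted, and 58 genera would each need at least one new reference genome to meet the one-species-per-genus goal.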
These high-quality genomes will provide rapid information for comparisons in the context of primate phylogeny to reconstruct the primate tree of life and clarify the genomic changes underlying the speciation and adaptation processes of major primate lineages. We anticipate that a detailed evolutionary landscape will be disclosed for all genomic variations across primate lineages from chromosomal rearrangements to single base-pair substitutions. This landscape will inform the evolutionary patterns of structural variations, segmental duplications, protein-coding genes, and non-coding regulatory elements. By integrating the metarecords of trait data, genomic information will enable us to