The revised reference genome of the leopard gecko (Eublepharis macularius) provides insight into the considerations of genome phasing and assembly
Y. Chiari,Chase H. Smith,T. Gamble,Shannon E. Keating,Justin C. Havird,Brendan J. Pinto
DOI: https://doi.org/10.1101/2023.01.20.523807
2023-01-20
bioRxiv
Abstract:Genomic resources across squamate reptiles (lizards and snakes) have lagged behind other vertebrate systems and high-quality reference genomes remain scarce. Of the 23 chromosome-scale reference genomes across the order, only 12 of the ~60 squamate families are represented. Within geckos (infraorder Gekkota), a species-rich clade of lizards, chromosome-level genomes are exceptionally sparse representing only two of the seven extant families. Using the latest advances in genome sequencing and assembly methods, we generated one of the highest quality squamate genomes to date for the leopard gecko, Eublepharis macularius (Eublepharidae). We compared this assembly to the previous, short-read only, E. macularius reference genome published in 2016 and examined potential factors within the assembly influencing contiguity of genome assemblies using PacBio HiFi data. Briefly, the read N50 of the PacBio HiFi reads generated for this study was equal to the contig N50 of the previous E. macularius reference genome at 20.4 kilobases. The HiFi reads were assembled into a total of 132 contigs, which was further scaffolded using HiC data into 75 total sequences representing all 19 chromosomes. We identified that 9 of the 19 chromosomes were assembled as single contigs, while the other 10 chromosomes were each scaffolded together from two or more contigs. We qualitatively identified that percent repeat content within a chromosome broadly affects its assembly contiguity prior to scaffolding. This genome assembly signifies a new age for squamate genomics where high-quality reference genomes rivaling some of the best vertebrate genome assemblies can be generated for a fraction previous cost estimates. This new E. macularius reference assembly is available on NCBI at JAOPLA010000000. The genome version and its associated annotations are also available via this Figshare repository https://doi.org/10.6084/m9.figshare.20069273.
Environmental Science,Biology,Medicine
What problem does this paper attempt to address?