De novo whole-genome assembly and annotation of a high-quality coffee variety from the primary origin of coffee, Coffea arabica var. Geisha

Juan F. Medrano, Dario Cantu, Andrea Minio, Christian Dreischer, Theodore Gibbons, Jason Chin, Shiyu Chen, Allen Van Deynze, Amanda M Hulse-Kemp
DOI: https://doi.org/10.1101/2024.06.21.600137
2024-06-27
Abstract: Geisha coffee is recognized for its unique aromas and flavors and, accordingly, has achieved the highest prices in the specialty coffee markets. We report the development of a chromosome-level, well-annotated genome assembly of C. arabica var. Geisha, considered an Ethiopian landrace that represents germplasm from the Ethiopian center of origin of coffee. We used a hybrid assembly approach combining two long-read single-molecule sequencing technologies, Oxford Nanopore and Pacific Biosciences, together with scaffolding with Hi-C libraries. The final assembly is 1.03 GB in size, with a BUSCO assessment of assembly completeness of 97.7% of single-copy ortholog clusters. RNAseq and IsoSeq data were used as transcriptional experimental evidence for annotation and gene prediction, revealing the presence of 47,062 gene loci encompassing 53,273 protein-coding transcripts. Comparison of the assembly to the progenitor subgenomes separated the sets of chromosome sequences inherited from each of the two progenitor species. Corresponding orthologs between Geisha and Red Bourbon had a 99.67% median identity, higher than what we observe with the progenitor assemblies (median 97.28%). Both Geisha and Red Bourbon contain an inversion on Chromosome 10, relative to the pseudomolecules of the genetic material inherited from the two progenitors, that must have happened before the separation in the geographical migration of the two varieties. This lends support to a single allopolyploidization event that gave origin to C. arabica after the hybridization of the two progenitor lines. Broadening the availability of high-quality genome assemblies of varieties paves the way for understanding the evolution and domestication of coffee, as well as the genetic basis and environmental interactions that explain why a variety like Geisha is capable of producing beans of such exceptional and unique quality.
Genomics