A Haplotype-resolved, Chromosome-scale Genome for Malus domestica 'WA 38'

Huiting Zhang, Itsuhiro Ko, Abigail Eaker, Sabrina Haney, Ninh Khuu, Kara Ryan, Aaron Appleby, Brenden Hoffmann, Henry Landis, Kenneth Pierro, Noah Willsea, Heidi Hargarten, Alan Yocca, Alex Harkess, Loren Honaas, Stephen Ficklin
DOI: https://doi.org/10.1101/2024.01.10.574953
2024-05-29
Abstract: Genome sequencing for agriculturally important Rosaceous crops has made rapid progress in both completeness and annotation quality. Whole genome sequence and annotation give breeders, researchers, and growers information about cultivar-specific traits such as fruit quality and disease resistance, and inform strategies to enhance postharvest storage. Here we present a haplotype-phased, chromosome-level genome of Malus domestica 'WA 38', a new apple cultivar released to market in 2017 as Cosmic Crisp®. Using both short and long read sequencing data with a k-mer based approach, chromosomes originating from each parent were assembled and segregated. This is the first pome fruit genome fully phased into parental haplotypes in which chromosomes from each parent are identified and separated into their unique, respective haplomes. The two haplome assemblies, the 'Honeycrisp'-derived HapA and the 'Enterprise'-derived HapB, are about 650 megabases each, and both have a BUSCO score of 98.7% complete. A total of 53,028 and 54,235 genes were annotated from HapA and HapB, respectively. Additionally, we provide genome-scale comparisons to 'Gala', 'Honeycrisp', and other relevant cultivars, highlighting major differences in genome structure and gene family circumscription. This assembly and annotation were done in collaboration with the American Campus Tree Genomes project, which includes 'WA 38' (Washington State University), 'd'Anjou' pear (Auburn University), and many more. To ensure transparency, reproducibility, and applicability for any genome project, our genome assembly and annotation workflow is recorded in detail and shared in a public GitLab repository. All software is containerized, offering a simple implementation of the workflow.
Genomics
What problem does this paper attempt to address?
This paper addresses the problem of producing a haplotype-resolved, chromosome-scale genome assembly for the apple (Malus domestica) cultivar 'WA 38', marketed as Cosmic Crisp®. 'WA 38' was bred from a cross between 'Honeycrisp' and 'Enterprise'; it has excellent storage and disease resistance characteristics but is also prone to some physiological disorders such as greasiness. Using a k-mer based approach with both short and long read sequencing data, the researchers assembled the chromosomes inherited from each parent and separated them into distinct haplomes, which the authors describe as the first pome fruit genome fully phased into parental haplotypes. The two haplome assemblies, HapA (derived from 'Honeycrisp') and HapB (derived from 'Enterprise'), are each about 650 megabases long, and both have a BUSCO completeness score of 98.7%. The team annotated 53,028 genes in HapA and 54,235 genes in HapB and compared the genome with those of 'Gala', 'Honeycrisp', and other relevant cultivars, revealing major differences in genome structure and gene family circumscription. The work was done in collaboration with students through the American Campus Tree Genomes (ACTG) project, and the assembly and annotation workflow is documented in a public GitLab repository with all software containerized, so that it can be reused for other genome projects. The study thus provides a genomic basis for investigating the characteristics of 'WA 38' and offers new perspectives for understanding and managing economically important traits of apple crops, such as physiological disorders.
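The k-mer based separation of parental chromosomes mentioned above can be illustrated with a toy sketch. This is not the authors' workflow (that is documented in their GitLab repository); it is a minimal illustration of the general idea, assuming sequence data from both parents is available. The function names, the tiny sequences, and the choice of k = 5 are all hypothetical, and the sketch ignores reverse complements and sequencing errors, which real pipelines must handle.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def assign_haplotype(contig, parent_a_reads, parent_b_reads, k=5):
    """Assign a contig to a parent by counting parent-specific k-mers.

    Hypothetical helper for illustration only. Returns a tuple
    (label, hits_a, hits_b), where label is "HapA", "HapB", or
    "unassigned" when neither parent has more specific k-mers.
    """
    # All k-mers observed in each parent's reads.
    a = set().union(*(kmers(r, k) for r in parent_a_reads))
    b = set().union(*(kmers(r, k) for r in parent_b_reads))

    # Keep only k-mers unique to one parent; shared k-mers carry no signal.
    only_a = a - b
    only_b = b - a

    contig_kmers = kmers(contig, k)
    hits_a = len(contig_kmers & only_a)
    hits_b = len(contig_kmers & only_b)

    if hits_a > hits_b:
        label = "HapA"
    elif hits_b > hits_a:
        label = "HapB"
    else:
        label = "unassigned"
    return label, hits_a, hits_b


# Toy data: the two "parents" differ by a single base (T vs A).
HONEYCRISP_READS = ["ACGTACGTTTGACCA"]   # stands in for the HapA parent
ENTERPRISE_READS = ["ACGTACGTTAGACCA"]   # stands in for the HapB parent

if __name__ == "__main__":
    for contig in ["GTACGTTTGACC", "GTACGTTAGACC", "ACGTACG"]:
        print(contig, assign_haplotype(contig, HONEYCRISP_READS, ENTERPRISE_READS))
```

A contig that spans the differing base carries k-mers found in only one parent and is assigned accordingly, while a contig drawn from a region identical in both parents has no parent-specific k-mers and stays unassigned.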