A haplotype-resolved genome provides insight into allele-specific expression in wild walnut (Juglans regia L.)

Liqun Han, Xiang Luo, Yu Zhao, Ning Li, Yuhui Xu, Kai Ma
DOI: https://doi.org/10.1038/s41597-024-03096-4
2024-03-09
Scientific Data
Abstract: Wild germplasm resources are crucial for gene mining and molecular breeding because of their special trait performance. Haplotype-resolved genome is an ideal solution for fully understanding the biology of subgenomes in highly heterozygous species. Here, we surveyed the genome of a wild walnut tree from Gongliu County, Xinjiang, China, and generated a haplotype-resolved reference genome of 562.99 Mb (contig N50 = 34.10 Mb) for one haplotype (hap1) and 561.07 Mb (contig N50 = 33.91 Mb) for another haplotype (hap2) using PacBio high-fidelity (HiFi) reads and Hi-C technology. Approximately 527.20 Mb (93.64%) of hap1 and 526.40 Mb (93.82%) of hap2 were assigned to 16 pseudochromosomes. A total of 41039 and 39744 protein-coding gene models were predicted for hap1 and hap2, respectively. Moreover, 123 structural variations (SVs) were identified between the two haplotype genomes. Allele-specific expression genes (ASEGs) that respond to cold stress were ultimately identified. These datasets can be used to study subgenome evolution, for functional elite gene mining and to discover the transcriptional basis of specific traits related to environmental adaptation in wild walnut.
multidisciplinary sciences
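The assembly statistics quoted in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below recomputes the share of each haplotype assigned to pseudochromosomes from the reported sizes, and shows how a contig N50 is conventionally defined. The `n50` helper and the toy contig lengths are illustrative assumptions; the paper's actual contig lengths are not given here.

```python
# Sizes in Mb, taken from the abstract.
ASSEMBLIES = {
    "hap1": {"total_mb": 562.99, "anchored_mb": 527.20},
    "hap2": {"total_mb": 561.07, "anchored_mb": 526.40},
}


def anchoring_rate(anchored_mb, total_mb):
    """Percentage of the assembly assigned to pseudochromosomes."""
    return 100.0 * anchored_mb / total_mb


def n50(lengths):
    """Length of the contig at which the running total, taken from the
    longest contig downwards, first reaches half of the assembly size."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


for name, a in ASSEMBLIES.items():
    rate = anchoring_rate(a["anchored_mb"], a["total_mb"])
    print(f"{name}: {rate:.2f}% anchored")

# Hypothetical contig lengths, only to illustrate the N50 definition.
print(n50([2, 3, 4, 5, 6]))
```

The recomputed percentages agree with the 93.64% and 93.82% reported in the abstract.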
What problem does this paper attempt to address?
This paper addresses the following problems:

1. **High heterozygosity of the wild walnut genome**: Wild walnut (*Juglans regia* L.) is highly heterozygous, which makes it difficult for traditional genome assembly methods to produce a high-quality reference genome. The authors therefore construct a haplotype-resolved genome to better characterize the biology of wild walnut.
2. **Subgenome evolution and functional gene mining**: A haplotype-resolved genome allows subgenome evolution to be studied in more depth and helps uncover functional genes related to environmental adaptation. The focus is on cold-stress-responsive genes, which are crucial for plant survival in extremely cold environments.
3. **Allele-specific expression (ASE)**: In highly heterozygous species, the two alleles of a gene may be expressed differently under specific environmental conditions. By analyzing the haplotype-resolved genome, the authors aim to reveal allele-specific expression under cold stress, which is important for understanding how plants adapt to environmental change through gene expression regulation.
4. **Identification of structural variations (SVs)**: Identifying structural variations, such as inversions, translocations, duplications, and deletions, between the two haplotype genomes provides important information about genome structure and function. These variations may affect gene expression and plant phenotypes.

In summary, the main objective of this paper is to construct a high-quality haplotype-resolved genome as a basis for exploring gene expression regulation in wild walnut under cold stress and the impact of genome structural variation on plant adaptability.
These research results are not only helpful for basic scientific research but also provide important resources for molecular breeding.
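
To make the idea of allele-specific expression concrete, here is a minimal sketch of one common way to flag it: an exact two-sided binomial test of whether reads assigned to the two haplotypes deviate from a 1:1 ratio. This is an illustration under stated assumptions, not the authors' pipeline, which is not described here. The function names, the `alpha` and `min_reads` thresholds, and the read counts are all hypothetical.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: the total probability of all
    outcomes that are no more likely than the observed count k.
    Intended for modest read counts; very large n may overflow floats."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so ties with the observed outcome are counted.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-7)))


def is_allele_specific(hap1_reads, hap2_reads, alpha=0.05, min_reads=10):
    """Flag a gene when its haplotype read counts depart from 1:1.
    alpha and min_reads are hypothetical thresholds."""
    total = hap1_reads + hap2_reads
    if total < min_reads:
        return False  # too few reads to make a call
    return binomial_two_sided_p(hap1_reads, total) < alpha


# Hypothetical (hap1, hap2) read counts for one gene in two conditions.
counts = {"control": (48, 52), "cold": (80, 20)}
for condition, (h1, h2) in counts.items():
    print(condition, is_allele_specific(h1, h2))
```

A real analysis would also correct for multiple testing across genes and use replicates; both are omitted here for brevity.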