Single‐molecule Long‐read Sequencing Reveals Extensive Genomic and Transcriptomic Variation Between Maize and Its Wild Relative Teosinte (Zea mays ssp. parviglumis)

Zhao Li, Linqian Han, Zi Luo, Lin Li
DOI: https://doi.org/10.1111/1755-0998.13454
IF: 7.7
2021-01-01
Molecular Ecology Resources
Abstract: Teosinte (Zea mays ssp. parviglumis), the wild progenitor of maize (Zea mays L.), is an important germplasm resource for improvement of modern maize lines. However, we have limited genetic and genomic information about teosinte and lack state-of-the-art tools to annotate transcriptomes assembled by single-molecule long-read sequencing without a reference genome. Here, we employed single-molecule long-read sequencing of cDNA libraries from five tissues of the teosinte inbred line TIL11 and identified 70,044 nonredundant transcript isoforms. We devised a state-of-the-art, machine learning-based bioinformatics pipeline, DenovoAS_Finder, to annotate the TIL11 transcriptome without a complete reference genome with an accuracy of up to 91%, providing a robust gene classifier for complex genomes. Additionally, we constructed a draft TIL11 genome with 16,633 high-quality contigs and an N50 of 112 kb by Nanopore sequencing. Genes from families that expanded from teosinte to maize were significantly enriched in the gene ontology (GO) term "RNA modification pathway" and had more transcript isoforms in TIL11 than in the maize inbred line B73. Genes showed collinearity between TIL11 and B73, and intergenic regions were extensively altered by transposable elements. Our study furthers the understanding of maize domestication and provides a resource for the utilization of wild germplasm in maize breeding.