VitisGDB: the Multifunctional Database for Grapevine Breeding and Genetics
Xiao Dong, Wei Chen, Zhenchang Liang, Xuzhen Li, Peter Nick, Shanshan Chen, Yang Dong, Shaohua Li, Jun Sheng
DOI: https://doi.org/10.1016/j.molp.2020.05.002
IF: 27.5
2020-01-01
Molecular Plant
Abstract: Grapevine cultivation has been gaining commercial popularity in many parts of the world due to the high yield and versatility of this horticultural crop. A recent survey from the International Organization of Vine and Wine (OIV) estimated that the global area under vine cultivation in 2018 was about 7.4 million hectares and that the world production of grapes was about 77.8 million tons in total (OIV, 2019). The majority of the global grape yield is used for producing wines, fresh fruit, and raisins, bringing in annual revenue of billions of US dollars (Alston and Sambucci, 2019). In addition to its economic value, the grapevine is also a useful model for the study of the genetic basis of clonality, fruit development, sex determination, grafting, evolution, and domestication (This et al., 2006). Furthermore, for many countries in the world, traditional viniculture and viticulture are important emblems of cultural identity. All these factors have made grapevine one of the most heavily invested plants in horticultural research.

The rise of genome-sequencing technologies has facilitated the release of reference-grade genetic codes and individual-level genetic variations for many grapevine species and cultivars (Canaguier et al., 2017; Zhou et al., 2017; Roach et al., 2018; Girollet et al., 2019; Liang et al., 2019; Minio et al., 2019; Vondras et al., 2019). Despite the increasing genomic data, a reliable platform for comparing and mining Vitis genomic information is not available. To fill this gap, we have developed VitisGDB, an online genus-level multifunctional genomics database for grapevine (Figure 1 and Supplemental Note; http://vitisgdb.ynau.edu.cn/).
VitisGDB aggregates genetic information for 50 out of 60 extant Vitis species, provides visualized results for a series of common genetic analyses, and implements easy-to-use bioinformatic tools to enable the investigation of economically important traits for breeding new grapevine cultivars. The framework of VitisGDB was constructed with MySQL, ThinkPHP, and FastAdmin (Figure 1 and Supplemental Note) to allow for easier data organization and a user-friendly interface. Four main modules, namely species, germplasm, phenotype, and gene (Figure 1), were created for the effective categorization and access of aggregated grapevine data.

In brief, the species module provides easy retrieval of information for one European Vitis species (two subspecies), 19 North American Vitis species, 26 East Asian Vitis species, and three species from other genera (Supplemental Figure 1). The main web page for each Vitis species starts with a species profile section, which includes the Latin name, chromosome number, geographical distribution, and morphological description. A representative picture (if publicly available) is also provided to facilitate taxonomic identification of the species. The second section lists the statistics of all available reference genome assemblies, by which the quality of the assemblies (contig N50, scaffold N50, and BUSCO value) can be compared. The following section presents a table of sequenced germplasm with extensive ID information. The final section presents interactive graphs of the phylogenetic tree and the population genetic analyses. The phylogenetic tree shows a clear classification of major grapevine groups, and the accession label shows detailed information for each grapevine. Both the scatterplot of the principal component analysis and the bar plot of the ADMIXTURE analysis can be zoomed in and out for clarity. Summary statistics of agronomic trait values are also presented as box-plot distributions.
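For readers less familiar with the assembly statistics listed in the species module, the contig or scaffold N50 is the length L such that sequences of length L or longer together account for at least half of the total assembly length. The Python sketch below illustrates that standard definition only; the text does not describe how VitisGDB itself computes these values, and the function name `n50` and the toy lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not taken from any Vitis assembly).
# Total length is 54; the three longest sequences (10 + 9 + 8 = 27)
# reach half of it, so the N50 is 8.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

A higher N50 at a similar total length indicates a more contiguous assembly, which is why the statistic is useful for comparing the reference genomes side by side.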
Finally, users have access to species-related literature that is periodically updated.

The germplasm module includes the passport data, whole-genome sequences, and published phenotypic data for 1641 Vitis accessions reported by various resequencing projects. To resolve the issue that a single cultivar may have different names, the genetic background of each accession was determined using SNP data and cross-verified with the VIVC database. Consequently, accessions with the same genetic background are grouped under the same prime name, whereas 28 accessions that might be misidentified are highlighted with the inferred taxa in the germplasm module and in the phylogenetic tree section of the species module.

The phenotype module indexes numeric or categorical values for a total of 45 grapevine phenotypic traits from 1461 accessions. For each trait, the descriptor includes the trait name, trait unit, OIV code, scale, and a brief summary of how the trait value was obtained. All phenotypic values are presented in a table with a histogram plot showing their distribution.

Gene annotation results for three chromosome-level reference genomes are integrated in the gene module. A total of 104 454 genes are curated. The web page for each gene first lists summary information (gene locus ID, gene symbol, gene type, position, and transcript number). The gene structure can be viewed in an embedded JBrowse. The coding sequence (CDS) of the gene and the amino acid sequence of the protein product are provided. The identified SNPs around and within the gene are also listed to facilitate marker selection for functional verification analysis. The expression level of the gene is presented in a heatmap for easy visualization.

In addition to the main modules, VitisGDB contains a total of 25 integrated tools and external databases devoted to Vitis genetic research (Supplemental Figure 2).
For instance, the BLAST tool is incorporated into a stand-alone web page, where 19 genome assemblies, seven CDS sequence databases, and seven protein sequence databases are available for querying orthologous gene candidates. The input can be either plain text or a FASTA sequence file. The alignment result (available in eight styles) opens in a new page, detailing the overall alignment score, query length, and similarity between the query and subject sequences. The BLAST result allows secondary filtering, and the final subjects can be downloaded in HTML format.

JBrowse is an efficient visualization tool that facilitates the viewing of gene models, CDS, heterozygous SNPs, and RNA sequencing data, each presented in a different color, in the context of the genomic region. At the moment, all available Vitis genomes and gene models are incorporated into JBrowse.

The JavaScript-based tool SynVisio is implemented to show the synteny relationships of three pairs of chromosome-level reference genomes (PN40024 versus Chardonnay, PN40024 versus Vitis riparia, and Chardonnay versus V. riparia). The visualization includes a hive plot indicating synteny between chromosomes, a dot plot indicating collinearity between two species/cultivars, and a scatterplot indicating identified signal strength. The threshold for displaying results in the hive plot and dot plot can be selected by dragging the small circle on the value bar between the minimum and the maximum (above the dotted line at the lower left corner).

To date, three grapevine genetic maps are available, covering a total of 70 832 marker loci on the genome. For a selected genetic map, a heat plot shows the density of loci along the chromosomes and a corresponding table provides basic information about the mapping population and the genetic map. Double-clicking on the heat plot will zero in on a chromosome of interest on a new web page.
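The locus-density heat plot described above can be thought of as a count of marker loci per fixed-width window along each chromosome. The Python sketch below shows one way such counts might be derived. It is an illustration under stated assumptions, not the procedure used by VitisGDB: the function name `locus_density`, the 1 Mb bin width, the use of 1-based physical positions, and the toy loci are all hypothetical.

```python
from collections import Counter


def locus_density(loci, bin_size=1_000_000):
    """Count marker loci per fixed-width bin on each chromosome.

    loci: iterable of (chromosome, position) pairs with 1-based positions.
    Returns a dict mapping (chromosome, bin_index) to the number of loci,
    where bin_index 0 covers positions 1..bin_size.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    counts = Counter()
    for chrom, pos in loci:
        if pos < 1:
            raise ValueError(f"position must be >= 1, got {pos}")
        counts[(chrom, (pos - 1) // bin_size)] += 1
    return dict(counts)


# Toy loci (hypothetical, not taken from any VitisGDB genetic map).
toy_loci = [
    ("chr1", 150_000),
    ("chr1", 900_000),
    ("chr1", 1_000_000),  # last position of bin 0
    ("chr1", 1_000_001),  # first position of bin 1
    ("chr2", 42),
]
density = locus_density(toy_loci)
for (chrom, bin_index), count in sorted(density.items()):
    print(chrom, bin_index, count)
```

Each count would correspond to one cell of the heat plot; a smaller `bin_size` gives finer resolution at the cost of sparser cells.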
On the chromosome page, the chromosome can be resized with the pointer to view regions in finer detail and to show details for each locus, with a hyperlink to JBrowse.

To allow personalized usage and analyses of the data, we have built a "Download" web page for all datasets available to the public. These include genome assembly sequences, annotation results, and genomic variations in FASTA, GFF, and VCF format, respectively. Considering the large size of the raw data for de novo assembly, resequencing projects, and RNA sequencing, we provide the NCBI BioProject ID and BioSample ID as well as the corresponding links on the web page. We have also imported the metadata for all Vitis-related publications from NCBI into VitisGDB for quick searching.

In summary, VitisGDB provides the most comprehensive view of Vitis genomic data to date and will be a valuable platform for studies on Vitis functional genomics and agronomic improvement. With the goal of becoming a community-built platform dedicated to making research results on grapevine broadly available, VitisGDB accepts the submission of all types of grapevine genetic data via the "Submit Data" page. VitisGDB will be continuously updated as genomic data from ongoing sequencing projects become available. New tools and analyses for transposable elements, non-coding RNAs, and environmental data will be added, so that VitisGDB will provide long-term support to the grapevine research community.

Funding: Yunnan Provincial Key Programs of Yunnan Eco-friendly Food International Cooperation Research Center Project (2019ZG00908).

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Document S1. Supplemental Notes and Supplemental Figures 1 and 2