Diverse Taxonomies for Diverse Chemistries: Enhanced Representation of Natural Product Metabolism in UniProtKB

Marc Feuermann, Emmanuel Boutet, Anne Morgat, Kristian B Axelsen, Parit Bansal, Jerven Bolleman, Edouard de Castro, Elisabeth Coudert, Elisabeth Gasteiger, Sébastien Géhant, Damien Lieberherr, Thierry Lombardot, Teresa B Neto, Ivo Pedruzzi, Sylvain Poux, Monica Pozzato, Nicole Redaschi, Alan Bridge, on behalf of the UniProt Consortium
DOI: https://doi.org/10.3390/metabo11010048
IF: 4.1
2021-01-12
Metabolites
Abstract: The UniProt Knowledgebase (UniProtKB) is a comprehensive, high-quality, and freely accessible resource of protein sequences and functional annotation that covers genomes and proteomes from tens of thousands of taxa, including a broad range of plants and microorganisms producing natural products of medical, nutritional, and agronomical interest. Here we describe work that enhances the utility of UniProtKB as a support both for the study of natural products and for their discovery. The foundation of this work is an improved representation of natural product metabolism in UniProtKB using Rhea, an expert-curated knowledgebase of biochemical reactions that is built on the ChEBI (Chemical Entities of Biological Interest) ontology of small molecules. Knowledge of natural products and precursors is captured in ChEBI, enzyme-catalyzed reactions in Rhea, and enzymes in UniProtKB/Swiss-Prot, thereby linking chemical structure data directly to protein knowledge. We provide a practical demonstration of how users can search UniProtKB for protein knowledge relevant to natural products through interactive or programmatic queries using metabolite names and synonyms, chemical identifiers, chemical classes, and chemical structures, and we show how to federate UniProtKB with other data and knowledge resources and tools using semantic web technologies such as RDF and SPARQL. All UniProtKB data are freely available for download in a broad range of formats, so that users can mine them further or use them as an annotation source to enrich other natural product datasets and databases.
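As an illustration of the programmatic, federated querying mentioned in the abstract, the following is a minimal sketch of how a SPARQL query linking a ChEBI identifier to UniProtKB/Swiss-Prot enzymes via Rhea might be assembled. It is not taken from the paper: the helper names `build_query` and `build_url` are hypothetical, the endpoint URLs and the predicate paths (`up:catalyzedReaction`, `rh:side/rh:contains/rh:compound/rh:chebi`) are assumptions based on the public UniProt and Rhea RDF schemas, and the ChEBI number in the usage note is only a placeholder. The sketch builds the request but does not send it.

```python
from urllib.parse import urlencode

# Assumed public SPARQL endpoints for UniProt and Rhea.
UNIPROT_SPARQL = "https://sparql.uniprot.org/sparql"
RHEA_SPARQL = "https://sparql.rhea-db.org/sparql"


def build_query(chebi_id: int, limit: int = 10) -> str:
    """Build a federated query: reviewed UniProtKB entries annotated with
    a Rhea reaction that has the given ChEBI compound as a participant."""
    if chebi_id <= 0 or limit <= 0:
        raise ValueError("chebi_id and limit must be positive integers")
    # Predicate names below are assumptions about the UniProt/Rhea schemas.
    return f"""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX rh: <http://rdf.rhea-db.org/>
PREFIX CHEBI: <http://purl.obolibrary.org/obo/CHEBI_>
SELECT DISTINCT ?protein ?reaction
WHERE {{
  SERVICE <{RHEA_SPARQL}> {{
    ?reaction rdfs:subClassOf rh:Reaction ;
              rh:side/rh:contains/rh:compound/rh:chebi CHEBI:{chebi_id} .
  }}
  ?protein up:reviewed true ;
           up:annotation/up:catalyticActivity/up:catalyzedReaction ?reaction .
}}
LIMIT {limit}
""".strip()


def build_url(query: str, endpoint: str = UNIPROT_SPARQL) -> str:
    """Encode the query as a GET request URL (SPARQL protocol 'query' parameter)."""
    return f"{endpoint}?{urlencode({'query': query})}"
```

Calling `build_url(build_query(27732, limit=5))` yields a URL that could be fetched with any HTTP client, typically with an `Accept: application/sparql-results+json` header; the same query text can also be pasted into the endpoint's web form for interactive use.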
biochemistry & molecular biology