GOTaxon: Representing the evolution of biological functions in the Gene Ontology

Haiming Tang, Christopher J Mungall, Huaiyu Mi, Paul D Thomas
DOI: https://doi.org/10.48550/arXiv.1802.06004
2018-02-17
Abstract: The Gene Ontology aims to define the universe of functions known for gene products, at the molecular, cellular and organism levels. While the ontology is designed to cover all aspects of biology in a "species independent manner", the fact remains that many if not most biological functions are restricted in their taxonomic range. This is simply because functions evolve, i.e., like other biological characteristics, they are gained and lost over evolutionary time. Here we introduce a general method of representing the evolutionary gain and loss of biological functions within the Gene Ontology. We then apply a variety of techniques, including manual curation, logical reasoning over the ontology structure, and previously published "taxon constraints" to assign evolutionary gain and loss events to the majority of terms in the GO. These gain and loss events now almost triple the number of terms with taxon constraints, and currently cover a total of 76% of GO terms, including 40% of molecular function terms, 78% of cellular component terms, and 89% of biological process terms. Database URL: GOTaxon is freely available at https://github.com/haimingt/GOTaxonConstraint
Populations and Evolution, Genomics